sentiment analysis

Sentiment Analysis from Cellphone Reviews Sagar Ahire | 155 Preeti Singh | 178

Upload: sagar-ahire

Post on 16-Dec-2014

DESCRIPTION

Slides for my college project on Sentiment Analysis of cellphone reviews. The system is to be built in Python and uses NLTK.

TRANSCRIPT

Page 1: Sentiment Analysis

Sentiment Analysis from Cellphone Reviews

Sagar Ahire | 155
Preeti Singh | 178

Page 2: Sentiment Analysis

What is Sentiment Analysis?

• Takes a block of text as input
• Determines the sentiment expressed in it
• “Sentiment” refers to whether the author’s opinion is positive or negative

Page 3: Sentiment Analysis

Disciplines Involved

• Natural Language Processing
• Data Mining
• Artificial Intelligence

Page 4: Sentiment Analysis

What Sentiment Analysis is NOT

• Does NOT use images anywhere (that is “emotion detection”)

• Does NOT aim at evaluating the product itself, just the sentiment expressed by the reviewer

Page 5: Sentiment Analysis

Why Sentiment Analysis is challenging

• Keywords are not usually direct
  “This phone is as modern as the one owned by Alexander Graham Bell”

• Opinions expressed may belong to other people
  “Many people say iPhones are better than Androids”

• Order Effects
  “This could have revolutionized phones for ever, but the bundled OS makes it an ultimate letdown”

• Colloquial and domain-specific phrases
  “The phone runs a 1.2 GHz dual core processor”
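
To make the first and third challenges concrete, here is a minimal sketch of a naive keyword counter applied to two of the example sentences above. The keyword lists and the `keyword_score` function are hypothetical, invented only for this illustration; they are not part of the project.

```python
# Hypothetical keyword lists, invented for this illustration only.
POSITIVE = {"modern", "better", "revolutionized", "great"}
NEGATIVE = {"letdown", "bad", "slow"}

def keyword_score(text):
    """Return (number of positive keywords) - (number of negative keywords)."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

sarcastic = "This phone is as modern as the one owned by Alexander Graham Bell"
mixed = ("This could have revolutionized phones for ever, "
         "but the bundled OS makes it an ultimate letdown")

print(keyword_score(sarcastic))  # above zero, although the review is negative
print(keyword_score(mixed))      # the two keywords cancel; word order is ignored
```

The sarcastic review scores as positive because "modern" is counted at face value, and the mixed review scores as neutral because the counter cannot see that the negative clause comes last and carries the verdict.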

Page 6: Sentiment Analysis

Project Overview

• Aims to perform sentiment analysis on cellphone reviews

• Rates the sentiment on a scale of 1 to 5 stars

Page 7: Sentiment Analysis

Inner Workings

• Uses a corpus of several cellphone reviews (currently 33)

• Trains a classifier using features, which may be:
  – Unigrams (occurrences of single words)
  – Bigrams (occurrences of word pairs)
  – Adjectives only, etc.

• Uses the classifier to classify unknown reviews
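
As a rough illustration of the first two feature types, the sketch below turns one review into unigram and bigram presence features using only the standard library. The function names, the `has(...)` key format, and the sample review are assumptions made for this example. Adjective-only features would additionally need a part-of-speech tagger and are not shown.

```python
import re

def tokenize(text):
    """Lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def unigram_features(tokens):
    """One feature per distinct word that occurs in the review."""
    return {"has(%s)" % word: True for word in tokens}

def bigram_features(tokens):
    """One feature per distinct pair of adjacent words."""
    return {"has(%s %s)" % pair: True for pair in zip(tokens, tokens[1:])}

# Hypothetical review text, not taken from the project's corpus.
review = "Great screen, but the battery is not great"
tokens = tokenize(review)

print(unigram_features(tokens))
print(bigram_features(tokens))
```

Note that the bigram features keep the pair "not great" together, which the unigram features cannot do.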

Page 8: Sentiment Analysis

Steps

Page 9: Sentiment Analysis

Why Python?

• Less code, more productivity
• Flexible paradigms (functional, procedural, object-oriented, all in one)
• Fast development cycle
• Wide range of modules

Page 10: Sentiment Analysis

Diving In…

• Modules used:
  – Python Standard Library (random, sys, etc.)
  – nltk

• Classifiers used:
  – Naïve Bayes

Page 11: Sentiment Analysis

Diving In… The Algorithm (Unigram Occurrences)

1. Take the entire corpus as input
2. Create a list ‘l’ of all documents, each labeled by its category (i.e., number of stars)
3. Extract the ‘n’ most frequent words in the entire corpus, cleaning up duplicates and non-alphabetic words

Page 12: Sentiment Analysis

Diving In… The Algorithm (Unigram Occurrences)

4. For every document in l:
   i. Create a dictionary d for that document
   ii. For each of the n frequent words, put a value in d indicating its presence or absence in the document

5. Divide the resulting list of labeled dictionaries into a training set and a testing set

Page 13: Sentiment Analysis

Diving In… The Algorithm (Unigram Occurrences)

6. Train a Naïve Bayes Classifier using the training set

7. Test the classifier using the testing set and report the accuracy
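
Steps 1 to 7 can be sketched with NLTK roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the six-review toy corpus, the value of `n`, the 4/2 split, and the `contains(...)` key format are all hypothetical, and the real 33-review corpus is not part of the slides.

```python
import random
import nltk

# Step 1: hypothetical toy corpus of (review text, star rating) pairs.
corpus = [
    ("excellent phone with a great camera", 5),
    ("great battery and excellent screen", 5),
    ("excellent value and great sound", 5),
    ("terrible phone with an awful battery", 1),
    ("awful screen and terrible camera", 1),
    ("terrible sound and awful value", 1),
]

# Step 2: list of documents, each a (tokens, label) pair.
documents = [(text.lower().split(), stars) for text, stars in corpus]

# Step 3: the n most frequent alphabetic words in the whole corpus.
n = 10  # assumed value, chosen only to fit the toy corpus
freq = nltk.FreqDist(w for tokens, _ in documents for w in tokens if w.isalpha())
frequent_words = [word for word, _ in freq.most_common(n)]

# Step 4: one presence/absence dictionary per document.
def document_features(tokens):
    present = set(tokens)
    return {"contains(%s)" % w: (w in present) for w in frequent_words}

featuresets = [(document_features(tokens), stars) for tokens, stars in documents]

# Step 5: shuffle, then split into a training set and a testing set.
random.seed(0)  # fixed seed so the split is repeatable
random.shuffle(featuresets)
train_set, test_set = featuresets[:4], featuresets[4:]

# Step 6: train a Naive Bayes classifier on the training set.
classifier = nltk.NaiveBayesClassifier.train(train_set)

# Step 7: report accuracy on the testing set.
accuracy = nltk.classify.accuracy(classifier, test_set)
print("Accuracy:", accuracy)
```

An unknown review is then classified by passing its feature dictionary to the trained classifier, for example `classifier.classify(document_features("great camera".split()))`. With a corpus this small the reported accuracy says little; it only shows where the number comes from.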

Page 14: Sentiment Analysis

Next Steps

• Investigating the Maximum Entropy Classifier
• Refining feature choice
  – Negation Tagging
  – Synonyms

• Investigating Regression techniques
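
One common way to do negation tagging, shown here only as a sketch, is to append a marker to every word that follows a negation word until the clause ends, so that "good" and "good_NEG" become separate features. The negation list, the clause terminators, and the `_NEG` suffix below are assumptions for this example; the slides do not say which scheme the project intends to use.

```python
import re

NEGATIONS = {"not", "no", "never", "cannot"}   # assumed, deliberately small list
CLAUSE_END = {".", ",", ";", "!", "?", "but"}  # assumed scope terminators

def tag_negation(text):
    """Append _NEG to each word between a negation word and the clause end."""
    tokens = re.findall(r"[a-z']+|[.,;!?]", text.lower())
    tagged, negated = [], False
    for tok in tokens:
        if tok in CLAUSE_END:
            negated = False          # the negation scope ends here
            tagged.append(tok)
        elif tok in NEGATIONS or tok.endswith("n't"):
            negated = True           # the negation scope starts after this word
            tagged.append(tok)
        else:
            tagged.append(tok + "_NEG" if negated else tok)
    return tagged

print(tag_negation("The camera is not good, but the screen is great."))
```

In this sentence only "good" falls inside a negation scope, so it is the only token that receives the marker.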

Page 15: Sentiment Analysis

Additional Applications of Sentiment Analysis

• Filtering of SPAM or abusive e-mails
• Gauging the mood of people in a particular network
• Government intelligence
• Psychological evaluation
• Recommendation Systems
• Display of ads on webpages

Page 16: Sentiment Analysis

“Sentiment is the poetry of the imagination.”
- Alphonse de Lamartine