sentiment analysis of film-related messages on social media

Sentiment Analysis of Film-related Messages on Social Media, Christopher Burdorf, NBCUniversal

Upload: domino-data-lab

Post on 05-Jan-2017

Sentiment Analysis of Film-related Messages on Social Media

Christopher Burdorf, NBCUniversal

“The big gamblers are not in Vegas, they are in Hollywood”

Animation Director

Page 3: Sentiment Analysis of Film-Related Messages on Social Media

Sentiment Analysis of Social Media

The process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer's attitude towards a particular topic, product, etc., is positive, negative, or neutral.

Facebook public messages: DataSift

Twitter tweets: public API (only 1%), Twitter Gnip Firehose (100%)

Stanford CoreNLP (https://github.com/stanfordnlp/CoreNLP): a natural language processing system that uses deep learning techniques to process sentiment.

Deep Learning

CoreNLP uses an RNTN (Recursive Neural Tensor Network).

RNTNs use compositional vector representations for phrases of variable length and syntactic type.

These representations are used as features to classify each word and phrase within a sentence.

The overall sentiment is computed from the vector values of the words and phrases the model has been trained to recognize.

Sentiment ranges from 0 (very negative) to 4 (very positive).

RNTN Model

RNTNs represent a phrase through word vectors and a parse tree, and then compute vectors for higher nodes in the tree using the same tensor-based composition function.
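The composition step can be sketched in a few lines. This is a minimal illustration of the tensor-based composition as it is usually written for RNTNs, where a parent vector is tanh(c^T V c + W c) and c is the concatenation of the two child vectors. The names `compose` and `encode`, the dimension d = 2, and all weight and word-vector values are assumptions made for the example; none of this is CoreNLP's own code or trained parameters.

```python
import math

def compose(left, right, V, W):
    """Combine two child vectors of length d into one parent vector of length d.

    c is the concatenation of the children (length 2d).
    V holds d slices, each a 2d x 2d matrix (the tensor); W is a d x 2d matrix.
    Each parent entry is tanh(c^T V[k] c + W[k] . c).
    """
    c = left + right
    n = len(c)
    parent = []
    for V_k, W_k in zip(V, W):
        bilinear = sum(c[i] * V_k[i][j] * c[j] for i in range(n) for j in range(n))
        linear = sum(w * x for w, x in zip(W_k, c))
        parent.append(math.tanh(bilinear + linear))
    return parent

def encode(tree, word_vectors, V, W):
    """Return the vector for a binary parse tree given as nested 2-tuples of words."""
    if isinstance(tree, str):
        return word_vectors[tree]
    left, right = tree
    # The same compose function is reused at every internal node of the tree.
    return compose(encode(left, word_vectors, V, W),
                   encode(right, word_vectors, V, W), V, W)

# Toy parameters (assumptions, d = 2): a zero tensor and a W that adds the children.
d = 2
V = [[[0.0] * (2 * d) for _ in range(2 * d)] for _ in range(d)]
W = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 1.0]]
word_vectors = {"Romantically": [0.1, 0.2], "Involved": [0.3, -0.2]}

phrase_vector = encode(("Romantically", "Involved"), word_vectors, V, W)
```

In a trained model each node's vector would also be fed to a classifier that assigns the node a sentiment class; that step is left out here.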

Film Sentiment: FSOG

Save messages from the DataSift Facebook public stream that reference Fifty Shades of Grey, and store them in HBase.

Stored 130,000 Facebook messages over a two-week period surrounding the film's opening (opening date Feb 13).

Stored 300MB of Facebook message JSON data.

Run sentiment analysis on the messages with different training models, using parallel Scala collections.
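The last step, scoring the same messages under several models in parallel, might look roughly like the sketch below. The talk used parallel Scala collections; a Python thread pool is swapped in here purely for illustration. The scorer `score_message`, the model names, the phrase tables, and the sample messages are all hypothetical stand-ins, not the models or data from the talk.

```python
from concurrent.futures import ThreadPoolExecutor

NEUTRAL = 2  # middle of the 0..4 sentiment scale

def score_message(text, model):
    """Hypothetical scorer: return the score of the first known phrase found, else neutral."""
    lowered = text.lower()
    for phrase, score in model.items():
        if phrase in lowered:
            return score
    return NEUTRAL

def score_all(messages, model, workers=4):
    """Score every message concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda text: score_message(text, model), messages))

# Hypothetical models and messages, for illustration only.
models = {
    "baseline": {},
    "film_phrases": {"can't wait": 4, "waste of money": 0},
}
messages = ["Can't wait for Friday!", "Total waste of money", "Seeing it tonight"]

results = {name: score_all(messages, model) for name, model in models.items()}
```

Python threads do not run CPU-bound work in parallel, so a real scorer would more likely use a process pool or, as in the talk, the JVM's parallel collections.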

Example

No model: Sentiment = 1, “Tonight we're feeling Romantically Involved #fiftyShades”

(4 (3 (2 Tonight) (3 (3 we're) (3 feeling))) (4 (4 Romantically) (4 Involved)))

With model: Sentiment = 4, “Tonight we're feeling Romantically Involved”

Can match phrases as well (e.g. “can't wait”).
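The bracketed tree in this example carries one sentiment label per word and per phrase. A small parser makes that structure explicit. This is a sketch written for this one notation, not part of CoreNLP; the helper names are made up for the example, and the three middle label names are assumed from the usual five-class convention (the slides name only the endpoints, 0 and 4).

```python
LABELS = ["very negative", "negative", "neutral", "positive", "very positive"]

def tokenize(tree_text):
    """Split the bracketed tree into '(' , ')' , labels and words."""
    return tree_text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens, pos=0):
    """Parse one '(label child...)' group starting at tokens[pos].

    Returns ((label, phrase, children), next_pos).
    """
    if tokens[pos] != "(":
        raise ValueError("expected '('")
    label = int(tokens[pos + 1])
    pos += 2
    children, words = [], []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse(tokens, pos)
            children.append(child)
            words.append(child[1])
        else:
            words.append(tokens[pos])
            pos += 1
    return (label, " ".join(words), children), pos + 1

def labeled_phrases(node):
    """Yield (label, phrase) for the node and all of its descendants, top down."""
    label, phrase, children = node
    yield label, phrase
    for child in children:
        yield from labeled_phrases(child)

tree_text = "(4 (3 (2 Tonight) (3 (3 we're) (3 feeling))) (4 (4 Romantically) (4 Involved)))"
root, _ = parse(tokenize(tree_text))
phrases = list(labeled_phrases(root))
for label, phrase in phrases:
    print(label, LABELS[label], "|", phrase)
```

The first pair produced is the root, so the sentence-level sentiment is simply `root[0]`.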

Facebook message counts: FSOG

Training Models: FSOG Median

Statistical Sampling

Manual assignment of sentiments on a statistically significant sample of messages.

95% confidence level, 7% margin of error.

Compare the result to the training model results.
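For a sense of scale, the sample size implied by a 95% confidence level and a 7% margin of error can be estimated with the standard formula for a proportion. The slides do not say how the sample size was actually chosen, so the formula, the worst-case proportion p = 0.5, and the finite-population correction for the 130,000 stored messages are assumptions of this sketch.

```python
import math

def sample_size(margin, z=1.96, p=0.5, population=None):
    """Sample size for estimating a proportion.

    margin: margin of error as a fraction (0.07 for 7%).
    z: z-score for the confidence level (1.96 for 95%).
    p: assumed proportion; 0.5 is the most conservative choice.
    population: if given, apply the finite-population correction.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(n0, 9))

n_infinite = sample_size(0.07)
n_finite = sample_size(0.07, population=130_000)
```

Under these assumptions both calls give 196 messages: the population is large enough that the correction does not change the rounded result.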

Sampling Results

Performance Issues

Spam: 80% of tweets are spam; about 10% of Facebook messages are spam.

Spam filtering using matching phrases vs. H2O Deep Learning.

Training performance improvements: it took 8 hours to train the full plus movie critic set. Worked with the Stanford NLP group to multithread training, which reduced training time to 1 hour.
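The phrase-matching approach to spam filtering can be as simple as the sketch below. The phrase list is hypothetical and chosen only for illustration; the talk does not list the phrases that were actually used, and the H2O-based alternative is not shown.

```python
# Hypothetical spam phrases, for illustration only.
SPAM_PHRASES = ("watch online free", "click here", "free download")

def is_spam(text, phrases=SPAM_PHRASES):
    """Return True if the message contains any known spam phrase."""
    # Lowercase and collapse whitespace so spacing tricks do not defeat the match.
    normalized = " ".join(text.lower().split())
    return any(phrase in normalized for phrase in phrases)

def drop_spam(messages, phrases=SPAM_PHRASES):
    """Keep only the messages that do not match a spam phrase."""
    return [m for m in messages if not is_spam(m, phrases)]

sample = [
    "Fifty Shades FULL MOVIE  click   here",
    "Tonight we're feeling Romantically Involved",
]
kept = drop_spam(sample)
```

A fixed phrase list is cheap and easy to audit but misses spam it has not seen before, which is the usual motivation for comparing it against a learned classifier.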

Performance Improvements

Sentiment lookup performance improvements: 6 hours to analyze 130k messages.

Switched to a distributed database (Cassandra) and implemented concurrent lookups using Akka Actors, which resulted in a 7x speedup on 16 cores.
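A quick calculation puts these numbers in perspective. It assumes the 7x speedup is measured against the 6-hour run over the 130,000 messages, which the slides imply but do not state outright.

```python
baseline_hours = 6.0   # time to analyze the messages before the change
messages = 130_000
speedup = 7.0          # reported speedup
cores = 16

# Wall-clock time after the change, in minutes.
time_after_minutes = baseline_hours * 60 / speedup

# Messages processed per second before and after.
throughput_before = messages / (baseline_hours * 3600)
throughput_after = throughput_before * speedup

# Fraction of ideal linear scaling achieved on 16 cores.
efficiency = speedup / cores
```

Under that assumption the run drops from 6 hours to roughly 51 minutes, and throughput rises from about 6 to about 42 messages per second. A 7x gain on 16 cores is a parallel efficiency of about 44%, so some of the work still does not scale with core count.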

Other Languages

The Twitter Firehose is 40% English. Other languages (e.g. Spanish) are seeing prominent usage as well.

77% of Twitter's 284 million MAUs (Monthly Active Users) are located outside the USA.

82% of Facebook's 890 million DAUs (Daily Active Users) are located outside the USA and Canada.
