Source: people.inf.ethz.ch/jaggim/meetup/2/slides/ML-Meetup-2-Cieliebak.pdf

#like or #fail

How Can Computers Tell the Difference?

Mark Cieliebak 27.3.2014


Institute of Applied Information Technology (InIT)

ZHAW: Zurich University of Applied Sciences

- 10'000 students, 230 professors (FTE)

InIT: Institute of Applied Information Technology

- more than 30 professors

- Labs on Cloud Computing, Data Science etc.


[11]


About Me


Mark Cieliebak Institute of Applied Information Technology (InIT)

ZHAW, Winterthur

Email: [email protected], Website: www.zhaw.ch/~ciel

Research Interests

• Text Analytics

• Open Data

• Automated Test Generation

• Social Network Analysis


What is "Text Analytics"?

Synonyms: Text Mining, Natural Language Processing (NLP)

Picture sources [1] [2] [3] [4] [5] [6]: see slide "References"


Sample Application: Social Media Monitoring

Text Analytics Components:

• Find relevant documents

• Hot topic analysis

• Sentiment analysis

[7]

Sentiment Analysis for Product Reviews

"… WiFi Analytics is a free Android app that I find very handy when it comes to troubleshooting and monitoring a home network." [8]


Flavours of Sentiment Analysis

• Document Based

• Sentence Based

• Target-Specific

• Rating Prediction


Simple Sentiment Analysis

Idea: Count number of positive and negative words

"This analysis is good[+1]." +1 (pos)

"I find it beautiful[+1] and good[+1]." +2 (pos)

"It looks terrible[-1]." -1 (neg)

"This car has a blue color." 0 (neu)

Use Sentiment-Dictionary:

POSITIVE: good, love, nice, ...

NEUTRAL: hello, see, I

NEGATIVE: bad, hate, ugly, ...
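The word-counting idea can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's implementation: the word lists are the slide's dictionary entries plus "beautiful" and "terrible" from its example sentences, and the function names `score` and `label` are made up for this sketch.

```python
import re

# Dictionary entries from the slide, plus "beautiful" and "terrible",
# which appear only in the slide's example sentences (assumption).
POSITIVE = {"good", "love", "nice", "beautiful"}
NEGATIVE = {"bad", "hate", "ugly", "terrible"}

def score(text):
    """Return (number of positive words) - (number of negative words)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)

def label(text):
    """Map the score to pos / neg / neu as on the slide."""
    s = score(text)
    return "pos" if s > 0 else "neg" if s < 0 else "neu"

for sentence in [
    "This analysis is good.",
    "I find it beautiful and good.",
    "It looks terrible.",
    "This car has a blue color.",
]:
    print(sentence, score(sentence), label(sentence))
```

Words that are in neither list count as neutral, so the NEUTRAL list does not need to be stored.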


Improvements

Sample Text

• "This analysis is good." "This analysis is excellent."

• "The car is really very expensive."

• "This car has an appealing design and comfortable seats, but it is expensive."

Solution

• Fine-Grained Dictionaries: "This analysis is good[+2]." "This analysis is excellent[+3]."

• Detect Booster Words: "The car is really very expensive[-1 -1 -2]."

• New Category "Mixed": "This car has an appealing[+1] design and comfortable[+1] seats, but it is expensive[-1]." → mix(+2/-1)
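The three improvements can be combined in one small scorer. The sketch below is an illustration under stated assumptions: the lexicon values are taken from the slide's annotations, the booster list is invented, and each booster is assumed to add 1 to the magnitude of the next sentiment word (the slide does not spell out how boosters are weighted).

```python
import re

# Fine-grained scores as annotated on the slide.
LEXICON = {"good": 2, "excellent": 3, "appealing": 1,
           "comfortable": 1, "expensive": -1}
# Hypothetical booster list for this sketch.
BOOSTERS = {"really", "very", "extremely"}

def analyse(text):
    """Return (positive total, negative total, label)."""
    pos = neg = boost = 0
    for tok in re.findall(r"[a-z']+", text.lower()):
        if tok in BOOSTERS:
            boost += 1          # remember the booster for the next word
            continue
        s = LEXICON.get(tok, 0)
        if s > 0:
            pos += s + boost
        elif s < 0:
            neg += s - boost
        boost = 0               # boosters only affect the following word
    if pos and neg:
        label = "mix"
    elif pos:
        label = "pos"
    elif neg:
        label = "neg"
    else:
        label = "neu"
    return pos, neg, label

print(analyse("This car has an appealing design and comfortable seats, "
              "but it is expensive."))
```

Keeping the positive and negative totals separate, rather than summing them, is what makes the "Mixed" category possible.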


Improvements 2

Sample Text

• "This analysis is not good."

• "The car is appealing and I do not find it expensive."

• "I do not find the car expensive and it is appealing."

Solution

• Invert all scores: "This analysis is not[*-1] good[+3]."

• Invert only the scores of words occurring after the negation: "The car is appealing[+3] and I do not[*-1] find it expensive[-2]."

• Need to "understand" the sentence

Need linguistic analysis!
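A short sketch shows why the "invert everything after the negation" rule is not enough. The scores are the ones annotated on the slide; the function name and the single-word negation list are assumptions of this sketch.

```python
import re

LEXICON = {"good": 3, "appealing": 3, "expensive": -2}
NEGATIONS = {"not"}  # assumption: a single negation word is enough here

def score_after_negation(text):
    """Invert the score of every sentiment word that follows a negation."""
    total = 0
    negated = False
    for tok in re.findall(r"[a-z']+", text.lower()):
        if tok in NEGATIONS:
            negated = True
            continue
        s = LEXICON.get(tok, 0)
        total += -s if negated else s
    return total

print(score_after_negation("This analysis is not good."))
print(score_after_negation("The car is appealing and I do not find it expensive."))
print(score_after_negation("I do not find the car expensive and it is appealing."))
```

The second and third sentences say the same thing, yet the rule scores the third one negative, because "appealing" also comes after "not". The rule has no notion of where the negation stops.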


Linguistic Analysis

→ RULE: Invert the scores of words that are in the same phrase as the negation.

"I do not find the car expensive[+2] and it is appealing[+3]." → +5 (pos)

Parse tree:

Sentence
  Sentence
    Noun Phrase: I (Det.)
    Verb Phrase: do (Verb) not (Adverb) find (Verb)
      Noun Phrase: the (Det.) car (Noun)
      expensive (Adj.)
  Conj.: and
  Sentence
    Noun Phrase: it (Det.)
    Verb Phrase: is (Verb) appealing (Participle)
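The rule can be sketched once the sentence has been split into its sub-sentences. In the code below the split is written by hand and stands in for the output of a real parser or chunker; the scores are the ones used on the earlier negation slide, and all names are hypothetical.

```python
LEXICON = {"appealing": 3, "expensive": -2}
NEGATIONS = {"not"}

def score_clauses(clauses):
    """Sum clause scores, inverting a clause that contains a negation."""
    total = 0
    for clause in clauses:
        s = sum(LEXICON.get(tok, 0) for tok in clause)
        if any(tok in NEGATIONS for tok in clause):
            s = -s              # negation only affects its own clause
        total += s
    return total

# Hand-written split at the conjunction "and" (assumption: a parser
# would deliver these two sub-sentences).
clauses = [
    ["i", "do", "not", "find", "the", "car", "expensive"],
    ["it", "is", "appealing"],
]
print(score_clauses(clauses))
```

Because the negation is confined to the first sub-sentence, "expensive" is inverted to +2 while "appealing" keeps its +3.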


Technology Stack for Sentiment Analysis on Web Documents using Linguistic Analysis

Linguistic Framework

• GATE: text analysis framework with experimentation GUI; includes sentence splitter, tokenizer, chunker, POS tagger, etc.; plugin mechanism for new libraries; very widely used. http://gate.ac.uk/

• OpenNLP: Java-based, easy to use; includes tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, etc.; by Apache Foundation. http://opennlp.apache.org/

Web Boilerplate Extraction

• Boilerpipe: removes ads, menus, HTML tags etc. http://boilerpipe-web.appspot.com/

Language Analysis

• Nutch language identifier plugin: 14 languages preimplemented, "easy" to add more. http://wiki.apache.org/nutch/LanguageIdentifierPlugin

Sentence Splitting

• OpenNLP: very easy to use, in Java. http://opennlp.apache.org/

• Punkt: in Python. http://nltk.org/api/nltk.tokenize.html

Part-of-Speech Tagger

• TreeTagger: language-independent, several languages implemented. http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/

• TweetNLP: POS tagger just for Twitter. http://www.ark.cs.cmu.edu/TweetNLP/

Chunker

• TreeTagger: supports English, German, French. http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/

Dictionaries

• WordNet: large lexical database of English. http://wordnet.princeton.edu/

• SentiWordNet: English dictionary with sentiment annotations. http://sentiwordnet.isti.cnr.it/


Main Approaches to Sentiment Analysis

• Rule-Based [9]

• Corpus-Based [10]

Corpus-Based Sentiment Analysis

[Figure [10]: supervised learning, ending in a predicted label]

Annotated Corpus

Sentence | Polarity

This analysis is good. | Pos

It looks awful. | Neg

This car has a blue color. | Neu

This car has an appealing design, comfortable seats, but it is expensive. | Mix

This car has a very appealing design, comfortable seats, but it is really expensive. | Mix

This analysis is not good. | Neg

This car has an appealing design, comfortable seats and it is not expensive. | Mix

This movie was like a horror event. | Neg

This car is appealing and is not expensive. | Mix

... | ...
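To make the corpus-based idea concrete, here is a minimal sketch that learns from the annotated sentences above. The slides do not name a learning algorithm at this point; a multinomial Naive Bayes classifier over single words with add-one smoothing is swapped in purely as an illustration, and all function names are made up.

```python
import math
import re
from collections import Counter, defaultdict

CORPUS = [
    ("This analysis is good.", "Pos"),
    ("It looks awful.", "Neg"),
    ("This car has a blue color.", "Neu"),
    ("This car has an appealing design, comfortable seats, but it is expensive.", "Mix"),
    ("This car has a very appealing design, comfortable seats, but it is really expensive.", "Mix"),
    ("This analysis is not good.", "Neg"),
    ("This car has an appealing design, comfortable seats and it is not expensive.", "Mix"),
    ("This movie was like a horror event.", "Neg"),
    ("This car is appealing and is not expensive.", "Mix"),
]

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(corpus):
    """Count documents per label and words per label."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sentence, label in corpus:
        class_docs[label] += 1
        for tok in tokenize(sentence):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-probability."""
    class_docs, word_counts, vocab = model
    n_docs = sum(class_docs.values())
    best_label, best_lp = None, -math.inf
    for label, docs in class_docs.items():
        total = sum(word_counts[label].values())
        lp = math.log(docs / n_docs)
        for tok in tokenize(text):
            # add-one smoothing so unseen words do not zero out a class
            lp += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best_label, best_lp = label, lp
    return best_label

model = train(CORPUS)
print(predict(model, "It looks awful."))
```

With only nine training sentences the model is far too small to generalise; the point is only that the polarity rules are learned from annotated examples instead of being written by hand.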


Sample Features for Tweets

• Word ngrams: presence or absence of contiguous sequences of 1, 2, 3, and 4 tokens; noncontiguous ngrams

• POS: the number of occurrences of each part-of-speech tag

• Sentiment Lexica: each word annotated with tonality score (-1..0..+1)

• Negation: the number of negated contexts

• Punctuation: the number of contiguous sequences of exclamation marks, question marks, and both exclamation and question marks

• Emoticons: presence or absence, last token is a positive or negative emoticon

• Hashtags: the number of hashtags

• Elongated words: the number of words with one character repeated (e.g. ‘soooo’)

from: Mohammad et al., SemEval 2013
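Several of these features need nothing more than string handling. The sketch below covers a subset (1- and 2-grams, punctuation runs, emoticons, hashtags, elongated words); it is not the feature extractor of Mohammad et al. The emoticon lists, the feature names, and the thresholds (a run means two or more marks, elongated means a character repeated three or more times) are assumptions made for this sketch.

```python
import re

# Tiny, hypothetical emoticon lists.
POS_EMOTICONS = {":)", ":-)", ":D"}
NEG_EMOTICONS = {":(", ":-("}

def tweet_features(tweet):
    """Return a dict of simple surface features for one tweet."""
    tokens = tweet.split()
    lower = [t.lower() for t in tokens]
    feats = {}
    # Presence of contiguous word 1-grams and 2-grams.
    for n in (1, 2):
        for i in range(len(lower) - n + 1):
            feats["ngram=" + " ".join(lower[i:i + n])] = 1
    feats["hashtags"] = sum(t.startswith("#") for t in tokens)
    # A word counts as elongated if some character occurs 3+ times in a row.
    feats["elongated"] = sum(bool(re.search(r"(\w)\1\1", t)) for t in lower)
    # Runs of two or more '!' and/or '?' characters.
    runs = re.findall(r"[!?]{2,}", tweet)
    feats["excl_runs"] = sum(set(r) == {"!"} for r in runs)
    feats["quest_runs"] = sum(set(r) == {"?"} for r in runs)
    feats["mixed_runs"] = sum(set(r) == {"!", "?"} for r in runs)
    feats["has_emoticon"] = int(any(t in POS_EMOTICONS | NEG_EMOTICONS for t in tokens))
    last = tokens[-1] if tokens else ""
    feats["last_emoticon"] = ("pos" if last in POS_EMOTICONS
                              else "neg" if last in NEG_EMOTICONS else "none")
    return feats

print(tweet_features("soooo good!!! #like :)"))
```

A classifier would consume these dictionaries after they are turned into numeric vectors; POS tags, lexicon scores and negated contexts are left out because they need external resources.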


How good are Sentiment Analysis Tools?


Evaluation (joint work with Oliver Dürr and Fatih Uzdilli [15])

• 7 Public Text Corpora

– Single Statements

– Different Media Types: Tweet, News, Review, Speech Transcript

– Total: 28653 Texts

• 9 Commercial APIs

– Stand-alone

– Free for this evaluation

– Arbitrary Text

Classes: NEGATIVE, POSITIVE, OTHER (neutral / mixed)


Sentiment Analysis Tools

AlchemyAPI www.alchemyapi.com

Lymbix www.lymbix.com

ML Analyzer www.mashape.com/mlanalyzer/ml-analyzer

Repustate www.repustate.com

Semantria www.semantria.com

Sentigem www.sentigem.com

Skyttle www.skyttle.com

Textalytics core.textalytics.com

Text-processing www.text-processing.com


Tool Accuracy

[Bar chart: accuracy per corpus, axis from 0.2 to 0.8. Series: Best Tool per Corpus, Worst Tool per Corpus. Avg.: 61%, 40%.]

Tool Accuracy

[Bar chart: accuracy per corpus, axis from 0.2 to 0.8. Series: Best Tool per Corpus, Worst Tool per Corpus, Overall Best Tool. Avg.: 61%, 40%, 59%.]

Summary

"They all suck…and we suck, too."

CEO of a Sentiment Analysis Company


State of the Art in Academia

SemEval 2013: Competition on Semantic Analysis

• Task 2A: Contextual Polarity Disambiguation

• Task 2B: Message Polarity Classification

Corpus: 15'000 tweets from 2012 on various topics, annotated via Mechanical Turk


SemEval 2013 Task 2B

Results

• 36 Submissions

• Average F1 = 54%

• Only 9 teams above 50%, all others below!

• Best Team: F1 = 69%

• For comparison: best commercial tool has F1 = 62%
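The slide does not say how F1 is computed. The official score of the SemEval 2013 Twitter sentiment task is the average of the F1 values for the positive and the negative class, with neutral texts influencing the result only through misclassifications. A minimal sketch of that measure follows; the function names and label strings are chosen for this example.

```python
def f1_for(label, gold, pred):
    """F1 of a single class from parallel lists of gold and predicted labels."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def avg_f1_pos_neg(gold, pred):
    """Average of the F1 scores of the positive and negative classes."""
    return (f1_for("positive", gold, pred) + f1_for("negative", gold, pred)) / 2

gold = ["positive", "positive", "negative", "neutral"]
pred = ["positive", "negative", "negative", "neutral"]
print(round(avg_f1_pos_neg(gold, pred), 3))
```

Because this is not plain accuracy, the F1 figures on this slide are not directly comparable to the accuracy percentages reported for the commercial tools.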


Can a Meta-Classifier do better?

1st Approach: Majority Classifier

Sentiment with most votes chosen

Illustration:

          api1  api2  api3  api4  api5  api6  api7  Majority
Text 1     +     +     -     o     -     +     o       +
Text 2     -     +     +     -     -     -     -       -
Text 3     -     o     +     +     +     +     -       +
Text n     o     o     +     o     -     o     o       o
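The majority classifier is simple enough to sketch directly. The rows below are the ones from the illustration; how ties are resolved is not stated on the slide, so this sketch simply keeps the label that was seen first among the tied ones.

```python
from collections import Counter

def majority(votes):
    """Return the label with the most votes (first-seen label wins ties)."""
    return Counter(votes).most_common(1)[0][0]

# One row per text, one vote per API: '+', '-' or 'o'.
rows = {
    "Text 1": list("++-o-+o"),
    "Text 2": list("-++----"),
    "Text 3": list("-o++++-"),
    "Text n": list("oo+o-oo"),
}
for name, votes in rows.items():
    print(name, majority(votes))
```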


Tool Accuracy

[Bar chart: accuracy per corpus, axis from 0.2 to 0.8. Series: Best Tool per Corpus, Worst Tool per Corpus.]

Majority Classifier Close to Best Tool

[Bar chart: accuracy per corpus, axis from 0.2 to 0.8. Series: Best Tool per Corpus, Worst Tool per Corpus, Majority Classifier.]

2nd Approach: Random-Forest

          api1  api2  api3  api9  annotation  used for  Random Forest Classifier output
Text 1     +     -     +     o        +        Train
Text 2     -     +     o     +        -        Train
Text 3     -     o     -     +        -        Train
Text 4     +     o     +     -        +        Train
Text 5     +     o     +     o        o        Train
Text 6     +     o     o     -        o        Train
Text 7     +     -     +     o     unknown     Predict    +
Text 8     +     +     o     -     unknown     Predict    +
Text 9     o     -     +     o     unknown     Predict    o
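The setup can be sketched with scikit-learn's `RandomForestClassifier`: the API outputs are the features, the human annotation is the target. The slides do not say which library or settings were used, so the library choice, the numeric encoding of the votes (+1, 0, -1), `n_estimators=100` and the fixed `random_state` are all assumptions of this sketch. The data rows are the four API columns shown on the slide.

```python
from sklearn.ensemble import RandomForestClassifier

# Assumed encoding of the API outputs as ordered numbers.
ENCODE = {"+": 1, "o": 0, "-": -1}

def encode(rows):
    return [[ENCODE[v] for v in row] for row in rows]

# Texts 1-6: API votes with a human annotation (training data).
train_votes = [list("+-+o"), list("-+o+"), list("-o-+"),
               list("+o+-"), list("+o+o"), list("+oo-")]
train_labels = ["+", "-", "-", "+", "o", "o"]

# Texts 7-9: API votes without annotation (to be predicted).
unknown_votes = [list("+-+o"), list("++o-"), list("o-+o")]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(encode(train_votes), train_labels)
predicted = [str(p) for p in clf.predict(encode(unknown_votes))]
print(predicted)
```

Unlike the majority vote, the forest can learn that some APIs are more reliable than others, or that certain combinations of votes point to a particular label. Six training rows are of course far too few for this to show; the sketch only illustrates the data flow.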


Before Random Forest

[Bar chart: accuracy per corpus, axis from 0.2 to 0.8. Series: Best Tool per Corpus, Worst Tool per Corpus, Majority Classifier.]

Random Forest beats Best Single Tool

[Bar chart: accuracy per corpus, axis from 0.2 to 0.8. Series: Best Tool per Corpus, Worst Tool per Corpus, Majority Classifier, Random Forest Classifier.]

Challenges in Sentiment Analysis

Context Dependencies

• "Salaries for software engineers are extremely high."

Sarcastic Statements

• "What a great car, it stopped working the second day."

Informal Language

• "#YouCantDateMe if u still sag ur pants super hard...dat shit is played the fuck out!!!"

Huge Effort for New Languages


Talk in Short!

1. Sentiment analysis is an important challenge

2. Commercial tools classify 6 out of 10 docs wrong

3. Academic systems are not really better

4. Random forests outperform even the best tools


[14]

[12]

[13]


References

[1] http://www.teleportmyjob.com/blog/wp-content/uploads/2013/05/find_job.jpg

[2] http://screenshots.de.sftcdn.net/de/scrn/3340000/3340669/swiftkey-23-630x535.png

[3] https://lh4.ggpht.com/Fs4HLaMCgmY6O7BWGKZB8LdOM8GmiRAbKUcdJysNp0k74xmPMTSGVHigwvaLy_Tw=w300

[4] http://researcher.watson.ibm.com/researcher/files/us-bansal/watson2.jpg

[5] http://www.betadaily.com/wp-content/uploads/2011/02/BusinessPlanExecutiveSummary.jpg

[6] http://everycook.org/cms/en/hardware-en/pictures-en#

[7] http://cdn.fulltraffic.net/images/blog/social-media-monitoring-process.png

[8] http://www.pcmag.com/article2/0,2817,2426416,00.asp

[9] http://i1059.photobucket.com/albums/t424/shmi11y/UoS%20LingSite/tompushedthecar.png

[10] http://bigsnarf.files.wordpress.com/2013/04/supervised.png

[11] http://www.alumni.sml.zhaw.ch/portals/0/Volkartgeb%C3%A4ude%20gek%C3%BCrzt%202.jpg

[12] http://www.bitspeed.com/wp-content/uploads/2011/09/Sign-multi-cast.jpg

[13] http://wwwdelivery.superstock.com/WI/223/1829/PreviewComp/SuperStock_1829-47184.jpg

[14] http://www.odditycentral.com/art/arborsculpture-the-art-of-turning-young-trees-into-living-works-of-art.html

[15] Cieliebak, Dürr, Uzdilli, ESSEM 2013
