Post on 27-Jun-2020

WHY SENTIC?

Agenda

Intro

What do we need?

CLSA Model

Sentic Computing

Research Topics

INTRO

Web: Connecting People

The potential for knowledge sharing today is unmatched in history: never before have so many knowledgeable people been connected

Leonardo’s Laptop

Leonardo’s discoveries and inventions in science, art, engineering, and aesthetics were based only on his perception of the world

B. Shneiderman, Leonardo’s Laptop: Human Needs and the New Computing Technologies, MIT Press (2003)

The Web: A Young Boy

Less than 30 years have elapsed since the invention of the Web. We are still just playing with it, as we are yet to discover how to make full use of it

The Web as a Lab

The Web today not only represents an unlimited data store but also a multidisciplinary laboratory environment for world-scale experiments

B. White, The Web as a Laboratory, Invited Talk at WWW MABSDA (2013)

Eras of the Web

The Web is evolving towards a shared social experience, in which consumers will rely on their peers as they make online decisions and will shape future products

J. Owyang and others, The Future of the Social Web, Forrester Research (2009)

Big Social Data Analysis

Between the dawn of the Internet and 2003, there were 5 exabytes of information on the Web. Now, we create 5 exabytes every 2 days

E. Cambria, H. Wang, B. White. Big Social Data Analysis, Knowledge-Based Systems (2009)

The Web 3.0 Dream

People’s Fault?

Online content is mostly meant for human consumption. Why should web developers and bloggers care about making their content machine-processable?

Technology’s Fault?

WHAT DO WE NEED?

Evolution of NLP

Natural Language Processing technologies evolved from the era of punch cards (7 minutes per sentence) to the era of Google and the like (less than a second per sentence)

NLP Emergency

In a Web where User Generated Content has hit critical mass, NLP is becoming key for aggregating information, although systems are still limited by what they can see

More Than We See

Language is somewhere in between perception and understanding

D. Davidson, Seeing Through Language, Royal Institute of Philosophy Supplement, 42 (1997)

Understanding Language

Natural language understanding requires high-level symbolic capabilities that most NLP technologies currently do not possess:

Creation and propagation of dynamic bindings

Manipulation of recursive, constituent structures

Acquisition of and access to lexical, semantic and episodic memories

Control of multiple learning/processing modules and routing of information among such modules

Grounding of basic-level language constructs (objects and actions) in perceptual/motor experience

Representation of abstract concepts

M. Dyer, Connectionist natural language processing: A status report, in Computational Architectures Integrating Neural and Symbolic Processes, vol. 292 (1995)

The Hardest Problem?

We can understand almost anything, but we can’t understand how we understand.

Albert Einstein

We can understand human mental processes only slightly better than a fish understands swimming.

John McCarthy

How the mind works is still a mystery. We understand the hardware, but we don’t have a clue about the operating system.

James Watson

AI Meets Natural Stupidity

The key failure of AI is the persistence in seeking the best way to solve a problem, which leads to the creation of expert (rather than intelligent) systems

D. McDermott, Artificial Intelligence Meets Natural Stupidity, ACM SIGART Bulletin, 57 (1976)

Machine Learning is Great

Recent machine learning algorithms (Condensed Nearest Neighbor, Extreme Learning Machine, Multiple Kernel Learning and deep learning) can achieve high classification accuracy on any given labeled dataset

Learning What Again?

In most cases, we are simply teaching machines word co-occurrence frequencies.

It’s like teaching someone snowboarding by only showing them videos.

Jumping NLP Curves

E. Cambria, B. White, Jumping NLP Curves: A Review of Natural Language Processing Research, IEEE Computational Intelligence Magazine, 9 (2014)

A Possible Path to NLU

[Diagram: Natural Language Processing, Sentiment Analysis, Natural Language Understanding]

Suitcase Research Field

This past Saturday, I bought a Nokia phone and my girlfriend bought a Motorola phone.

We called each other when we got home.

The voice on my phone was not so clear, worse than my previous phone. The camera was good. My girlfriend was quite happy with her phone.

I wanted a phone with good voice quality.

So my purchase was a real disappointment.

I returned the phone yesterday.

Deeply Unstructured

According to different evaluation schemes and reviewers, a very positive and a very negative review might both have the same star rating

M. Hu and B. Liu, Mining and Summarizing Customer Reviews, in ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Seattle (2004)

Sentiment Analysis

Sentiment analysis research evolved from heuristics to discourse structure, from coarse to fine-grained analysis, from keyword to concept-level mining

E. Cambria, Affective computing and sentiment analysis, IEEE Intelligent Systems 31 (2016)

Keyword Spotting

Although it is the most naïve approach, the accessibility and economy of keyword spotting make it one of the most popular

A. Ortony, G. Clore and A. Collins, The Cognitive Structure of Emotions, Cambridge Univ. Press (1988)

Lexical Affinity

Lexical affinity assigns arbitrary words a probable affinity to particular emotions (for example, "accident" has a 75% probability of indicating negative affect)

R. Stevenson and others, Characterization of the Affective Norms for English Words by Discrete Emotional Categories, Behavior Research Methods, 39(4) (2007)
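The idea can be illustrated with a minimal Python sketch. Only the 0.75 value for "accident" comes from the slide; the other lexicon entries, the averaging rule, and the function name are assumptions made for illustration.

```python
# Hypothetical affinity lexicon: word -> probability of indicating negative affect.
# Only the value for "accident" is taken from the slide; the others are invented.
NEGATIVE_AFFINITY = {"accident": 0.75, "injury": 0.80, "holiday": 0.10}


def negative_affinity(text, lexicon=NEGATIVE_AFFINITY):
    """Average the negative-affect probability of the known words in `text`.

    Returns None when no word of the text is in the lexicon.
    """
    words = [w.strip(".,!?").lower() for w in text.split()]
    scores = [lexicon[w] for w in words if w in lexicon]
    return sum(scores) / len(scores) if scores else None


print(negative_affinity("I met my girlfriend by accident."))
```

The example sentence is scored as 75% negative even though "by accident" carries no negative affect here, which shows how a word-level approach can be misled by context.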

Statistical Methods

By feeding a machine learning algorithm a large training corpus, statistical methods not only learn the valence of affect words, but also that of other arbitrary keywords

R. Socher, A. Perelygin, J. Wu, J. Chuang, C. Manning, A. Ng, C. Potts, Recursive Deep

Models for Semantic Compositionality Over a Sentiment Treebank, EMNLP, (2013)

Concept-Level Analysis

By relying on ontologies or semantic networks, concept-level approaches step away from blindly using affect keywords and word co-occurrence frequencies

E. Cambria, An Introduction to Concept-Level Sentiment Analysis, LNAI, 8266 (2013)

Conceptualization

Concepts are immaterial entities that only exist in the mind of the speaker. To be communicated, they must be represented in terms of some concrete artifact

S. Ullmann, Semantics: An Introduction to the Science of Meaning, Barnes & Noble (1979)

Ceci n’est pas une pipe

You can know the names of all the different kinds of pipe, but you know nothing about a pipe until you comprehend its purpose and method of usage

R. Magritte, Les Mots et les Images, La Révolution surréaliste 12, (1929)

From BoW to BoC

smile vs. sad_smile

damn vs. damn_good

pretty vs. pretty_ugly
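A rough sketch of the difference between a bag of words (BoW) and a bag of concepts (BoC), in Python. The three concepts come from the slide; the greedy longest-match strategy and the function names are assumptions, not the method used by the authors.

```python
# Multi-word concepts shown on the slide.
CONCEPTS = {"sad_smile", "damn_good", "pretty_ugly"}


def bag_of_words(text):
    """Split the text into lowercase single-word tokens."""
    return text.lower().split()


def bag_of_concepts(text, concepts=CONCEPTS, max_len=2):
    """Greedily merge adjacent tokens that form a known multi-word concept."""
    tokens = text.lower().split()
    out, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first, falling back to the single token.
        for n in range(max_len, 0, -1):
            candidate = "_".join(tokens[i:i + n])
            if n == 1 or candidate in concepts:
                out.append(candidate)
                i += n
                break
    return out


print(bag_of_words("The food was damn good"))
print(bag_of_concepts("The food was damn good"))
```

In the bag of words, "damn" and "good" are separate tokens with conflicting connotations; in the bag of concepts they survive as the single unit damn_good.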

THE CLSA MODEL

Concept-Level Sentiment Analysis

Concept-Level Sentiment Analysis Model

E. Cambria, S. Poria, F. Bisio, R. Bajpai and I. Chaturvedi. The CLSA Model: A Novel Framework for Concept-Level Sentiment Analysis. In CICLing, LNCS 9042 (2015)

Microtext Analysis

Before NLP techniques can be applied, informal text, jargon, acronyms and emoticons must be converted into plain text

K. Bontcheva, L. Derczynski, A. Funk, M.A. Greenwood, D. Maynard and N. Aswani. TwitIE:

An Open-Source Information Extraction Pipeline for Microblog Text, in ACL, (2013)
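A minimal Python sketch of this normalization step follows. The lookup tables and the `normalize` function are hypothetical and tiny; they are not taken from TwitIE or from the slides, and a real system would need far larger resources.

```python
# Hypothetical lookup tables; real systems rely on much larger lexicons.
SLANG = {"gr8": "great", "u": "you", "r": "are", "thx": "thanks"}
EMOTICONS = {":)": "happy", ":(": "sad"}


def normalize(text):
    """Replace emoticons and informal tokens with plain-text equivalents.

    Punctuation attached to a word is dropped as a simplification.
    """
    out = []
    for token in text.split():
        if token in EMOTICONS:
            out.append(EMOTICONS[token])
            continue
        core = token.lower().strip(".,!?")
        word = SLANG.get(core, core)
        if word:  # skip tokens that were punctuation only
            out.append(word)
    return " ".join(out)


print(normalize("Thx, u r gr8 :)"))
```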

Semantic Parsing

The camera has [long focus time]

The camera takes a [long time] to [focus]

The [focusing] of the camera takes [long time]

The [focus time] of the camera is very [long]

long_focus_time

K. Frantzi, S. Ananiadou, H. Mima. Automatic recognition of multi-word terms: the C-value/NC-value method. International Journal on Digital Libraries 2 (2000)

Subjectivity Detection

Subjectivity detection is a very important subtask that consists in classifying a given text into one of two classes: objective (neutral) or subjective (positive/negative)

B. Pang, L. Lee. Subjectivity Detection and Opinion Identification, in Opinion Mining and

Sentiment Analysis, Now Publishers Inc (2008)

Anaphora Resolution

Anaphora is the use of an expression whose interpretation depends upon another one. It is commonly resolved by gender and number agreement

R. Mitkov. Anaphora Resolution: the State of the Art. University of Wolverhampton (1999)
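Resolution by gender and number agreement can be sketched in a few lines of Python. The pronoun table, the hand-annotated candidate tuples, and the "nearest agreeing mention" rule are assumptions made for illustration; real resolvers use many more constraints.

```python
# Hypothetical pronoun features: (gender, number); None means "any gender".
PRONOUNS = {
    "he": ("masc", "sg"),
    "she": ("fem", "sg"),
    "it": ("neut", "sg"),
    "they": (None, "pl"),
}


def resolve(pronoun, candidates):
    """Return the nearest preceding mention that agrees with the pronoun.

    `candidates` is a list of (mention, gender, number) tuples in order of
    appearance in the text; None is returned when nothing agrees.
    """
    gender, number = PRONOUNS[pronoun.lower()]
    for mention, cand_gender, cand_number in reversed(candidates):
        if cand_number == number and (gender is None or cand_gender == gender):
            return mention
    return None


mentions = [("my girlfriend", "fem", "sg"), ("the phone", "neut", "sg")]
print(resolve("she", mentions))
print(resolve("it", mentions))
```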

Topic Spotting

Topic spotting consists in assigning category tags to a piece of text. It includes several subtasks and can be exploited for context-level analysis

Y. Ma, E. Cambria and S. Gao. Label Embedding for Zero-shot Fine-grained Named Entity

Typing. In: COLING, Osaka (2016)

Aspect Extraction

S. Poria, E. Cambria and A. Gelbukh. Aspect extraction for opinion mining with a deep

convolutional neural network. Knowledge-Based Systems 108. (2016)

Polarity Detection

Early works treated polarity detection as a binary classification problem (pos vs neg). Recent works calculate polarity intensity as a float in the range [-1, +1]

B. Pang, L. Lee. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval 2 (2008)
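The two output types can be contrasted with a small Python sketch. The polarity scores, the averaging rule, and both function names are invented for illustration; they are not values from SenticNet or any other resource.

```python
# Hypothetical word polarities in [-1, +1].
POLARITY = {"good": 0.6, "excellent": 0.9, "disappointment": -0.7, "unclear": -0.4}


def polarity_intensity(text):
    """Return a float in [-1, +1]: the mean polarity of the known words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    scores = [POLARITY[w] for w in words if w in POLARITY]
    if not scores:
        return 0.0
    return max(-1.0, min(1.0, sum(scores) / len(scores)))


def polarity_label(text):
    """Binary view of the same score; zero is arbitrarily mapped to 'pos'."""
    return "pos" if polarity_intensity(text) >= 0 else "neg"


review = "So my purchase was a real disappointment."
print(polarity_intensity(review), polarity_label(review))
```

The binary label discards how strongly the opinion is held, which is the information the float output keeps.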

Sarcasm Detection

Sarcasm transforms the polarity of an apparently positive utterance into its opposite. It is often characterized by contextual imbalance and high intensity

S. Poria, E. Cambria, D. Hazarika, and P. Vij. A Deeper Look into Sarcastic Tweets Using

Deep Convolutional Neural Networks. In: COLING, Osaka (2016)

Sentic Computing

E. Cambria, A. Hussain. Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis. Cham: Springer (2015)

3 Shifts

From Mono- to Multi-Disciplinarity

From Syntax to Semantics

From Statistics to Linguistics

From Mono- to Multi-Disciplinarity

cloud_computing vs. cloud

pain_killer vs. pain, killer

From Syntax to Semantics

The car is nice but expensive

The car is expensive but nice
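The two sentences contain exactly the same words, so a bag-of-words score cannot tell them apart. The Python sketch below contrasts such a score with a simple adversative rule in which the clause after "but" determines the overall polarity. The rule, the polarity values, and the function names are assumptions for illustration; the slide itself does not state how the sentences are processed.

```python
# Hypothetical word polarities.
POLARITY = {"nice": 0.7, "expensive": -0.5}


def bow_polarity(sentence):
    """Order-insensitive score: sum of the word polarities."""
    return sum(POLARITY.get(w, 0.0) for w in sentence.lower().split())


def but_rule_polarity(sentence):
    """Score only the clause that follows 'but', if there is one."""
    words = sentence.lower().split()
    if "but" in words:
        words = words[words.index("but") + 1:]
    return sum(POLARITY.get(w, 0.0) for w in words)


for s in ("The car is nice but expensive", "The car is expensive but nice"):
    print(s, bow_polarity(s), but_rule_polarity(s))
```

Under the rule, the first sentence comes out negative and the second positive, while the bag-of-words score is identical for both.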

From Statistics to Linguistics

Building SenticNet

Knowledge Acquisition

WordNet Affect

WordNet-Affect is an extension of WordNet Domains, including a subset of synsets suitable to represent concepts correlated with affective words

C. Strapparava and A. Valitutti, WordNet-Affect: An affective extension of WordNet. In: LREC, Lisbon (2004)

WordNet Affect

C. Strapparava and A. Valitutti, WordNet-Affect: An affective extension of WordNet. In: LREC, Lisbon (2004)

Open Mind Common Sense

OMCS is an AI project whose goal is to build a commonsense knowledge base from the contributions of many thousands of people across the Web

R. Speer, Open Mind Commons: An inquisitive approach to learning common sense. In:

Workshop on Common Sense and Interactive Application, Honolulu (2007)

Common Knowledge

In standard human-to-human communication, people usually rely on the presumption that facts or definitions are known and proceed to build upon them

E. Cambria, Y.Q. Song, H. Wang and N. Howard. Semantic multi-dimensional scaling for

open domain sentiment analysis. In: IEEE Intelligent Systems 29, (2014)

Commonsense

People usually provide only useful information and take the rest for granted. The rest is commonsense: obvious things people know and usually leave unstated

E. Cambria, A. Hussain, C. Havasi and C. Eckl. Common Sense Computing: From the

Society of Mind to Digital Intuition and Beyond. In: LNCS 5507, (2009)

GECKA

The Game Engine for Commonsense Knowledge Acquisition aims to collect knowledge from game designers through the development of games

E. Cambria, D. Olsher, D. Rajagopal and K. Kwak. GECKA: Game Engine for

Commonsense Knowledge Acquisition. In: FLAIRS, Hollywood (2015)

Encoding Knowledge

Game designers drag and drop objects from libraries into scenes and define their possible interactions by means of prerequisite-outcome-goal (POG) triples

E. Cambria, D. Olsher, D. Rajagopal and K. Kwak. GECKA: Game Engine for

Commonsense Knowledge Acquisition. In: FLAIRS, Hollywood (2015)

Data Collection

POG data is encoded and collected in XML format. Interaction semantics between objects and characters are specified for each scene, together with affect information

E. Cambria, D. Olsher, D. Rajagopal and K. Kwak. GECKA: Game Engine for

Commonsense Knowledge Acquisition. In: FLAIRS, Hollywood (2015)
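To make the encoding concrete, here is a Python sketch that serializes one prerequisite-outcome-goal triple to XML with the standard library. The element and attribute names, the `pog_to_xml` helper, and the example values are all hypothetical: the slides do not show GECKA's actual schema.

```python
import xml.etree.ElementTree as ET


def pog_to_xml(scene, obj, prerequisite, outcome, goal, affect=None):
    """Serialize one POG triple (plus optional affect) to an XML string.

    The tag names below are assumptions, not GECKA's real format.
    """
    interaction = ET.Element("interaction", scene=scene, object=obj)
    ET.SubElement(interaction, "prerequisite").text = prerequisite
    ET.SubElement(interaction, "outcome").text = outcome
    ET.SubElement(interaction, "goal").text = goal
    if affect is not None:
        ET.SubElement(interaction, "affect").text = affect
    return ET.tostring(interaction, encoding="unicode")


print(pog_to_xml("kitchen", "knife", "knife is in hand",
                 "bread is sliced", "prepare a sandwich", affect="satisfaction"))
```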

Affective Information

POG specifications not only allow game designers to define interaction semantics between objects, but also affective reactions of different characters

E. Cambria, D. Olsher, D. Rajagopal and K. Kwak. GECKA: Game Engine for

Commonsense Knowledge Acquisition. In: FLAIRS, Hollywood (2015)

Knowledge Representation

AffectNet Graph

E. Cambria, J. Fu, F. Bisio and S. Poria. AffectiveSpace 2: Enabling affective intuition for

concept level sentiment analysis. In: AAAI, Austin (2015)

AffectNet Matrix

E. Cambria, J. Fu, F. Bisio and S. Poria. AffectiveSpace 2: Enabling affective intuition for

concept level sentiment analysis. In: AAAI, Austin (2015)

Affective Space

E. Cambria, J. Fu, F. Bisio and S. Poria. AffectiveSpace 2: Enabling affective intuition for

concept level sentiment analysis. In: AAAI, Austin (2015)

Knowledge-Based Reasoning

Sentic Activation

E. Cambria, D. Olsher and K. Kwak. Sentic Activation: A two-level Affective Common Sense

Reasoning Framework. In: AAAI, Toronto (2012)

Sentic Panalogy

Several analogous representations of the same problem should be kept in parallel so that the system can switch tracks when problem-solving stalls

E. Cambria, D. Olsher and K. Kwak. Sentic Panalogy: Swapping Affective Common Sense Reasoning Strategies. In: CogSci, Sapporo (2012)

Hourglass Model

The mind is made up of different independent resources. Turning some sets of resources on while turning others off results in different emotional states

E. Cambria, A. Livingstone and A. Hussain. The Hourglass of Emotions. In: Cognitive Behavioral Systems, LNCS 7403, Springer (2012)

Feeling and Thinking

The question is not whether intelligent machines can have emotions, but whether machines can be intelligent without any emotions

M. Minsky, The Emotion Machine: Commonsense Thinking, Artificial Intelligence, and the Future of the Human Mind. Simon & Schuster, New York (2006)

To Feel or not to Feel?

R. Plutchik. The Nature of Emotions. American Scientist 89 (2001)

Hourglass Model

Sentic Neurons

The integration of a bio-inspired paradigm with principal component analysis allows for better comprehension of non-linearities in AffectiveSpace

L. Oneto, F. Bisio, E. Cambria and D. Anguita. Statistical learning theory and ELM for big social data analysis, IEEE Computational Intelligence Magazine 11 (2016)

Research Topics

draft

Data Mining

The exploration of high-rate information streams requires specific techniques for addressing heterogeneous sources, especially in connection with social networks and blogs.

In this context, the normalization of unstructured data plays a crucial role in lexical/syntactic analysis and subsequent semantic processing.

The research aims to:

create suitable connectors to data streams (blogs, social networks, forums), since each source has its own format and requires specific adaptation of data gathering

analyze the gathered data for information extraction (lexicon, possibly syntax, semantics)

build a graph-like representation of facts and sources to obtain a comprehensive representation of the observed phenomenon

Algorithms

Handling massive data volumes requires specific algorithms to ensure effectiveness and, especially, efficiency in computation.

The main guidelines include:

efficient algorithms for distributed clustering of text documents

adaptive tools for effective classification, so that the system can adapt to the user's experience and expectations

graph-exploration algorithms for interpreting the observed scenario in terms of semantics and connections among sources
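As an illustration of graph exploration over facts and sources, the Python sketch below builds a small bipartite graph and uses breadth-first search to find which sources are connected through shared facts. The data, the node encoding, and both function names are hypothetical; breadth-first search is only one possible exploration strategy.

```python
from collections import defaultdict, deque


def build_graph(observations):
    """Build an undirected source-fact graph from (source, fact) pairs."""
    graph = defaultdict(set)
    for source, fact in observations:
        graph[("source", source)].add(("fact", fact))
        graph[("fact", fact)].add(("source", source))
    return graph


def connected_sources(graph, source):
    """Breadth-first search for the sources linked to `source` via shared facts."""
    start = ("source", source)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return sorted(name for kind, name in seen if kind == "source" and name != source)


observations = [
    ("blog_a", "phone X has poor voice quality"),
    ("forum_b", "phone X has poor voice quality"),
    ("forum_b", "phone Y has a good camera"),
    ("blog_c", "unrelated fact"),
]
print(connected_sources(build_graph(observations), "blog_a"))
```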

Sentic & Semantics

The analysis of texts and sources includes semantic analysis with several outcomes:

the ability to characterize a target aspect (product, brand, idea) with a representation of the feeling perceived by a target community

the ability to identify possible areas of market development by detecting areas of possible penetration to cover market deficiencies or prospects

the capability to associate a general Sentic representation with a community of sources in order to monitor expectations and reactions

User Experience

The main idea is to endow the system with the capability to adjust its own behaviour in compliance with the user's preferences.

At run time, the system observes and tracks the user's choices and directives, and applies the acquired information to its own subsequent selections that are prompted to the user in future interactions.

So the system creates a specific profile for each specific user. Classifier tools and semantic techniques are used to build such an integrated profile to optimize the user's experience.
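A very reduced Python sketch of per-user profiling follows. It replaces the classifier tools and semantic techniques mentioned above with simple choice counts, so it only illustrates the observe-then-adapt loop; the class and method names are hypothetical.

```python
from collections import Counter, defaultdict


class UserProfiles:
    """Track each user's choices and use them to order future suggestions."""

    def __init__(self):
        self._choices = defaultdict(Counter)

    def observe(self, user, choice):
        """Record one choice made by the user at run time."""
        self._choices[user][choice] += 1

    def rank(self, user, options):
        """Order options by how often the user picked them (ties keep input order)."""
        counts = self._choices[user]
        return sorted(options, key=lambda option: -counts[option])


profiles = UserProfiles()
for choice in ("reviews", "reviews", "news"):
    profiles.observe("ann", choice)
print(profiles.rank("ann", ["forums", "news", "reviews"]))
```

A user with no recorded history simply gets the options in their original order.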

Thanks to www.sentic.net

Thank You!
