WHY SENTIC?
Agenda
Intro
What do we need?
CLSA Model
Sentic Computing
Research Topics
INTRO
Web: Connecting People
The potential for knowledge
sharing today is unmatched
in history: never before
have so many
knowledgeable people been
connected
Leonardo’s Laptop
Leonardo’s discoveries and
inventions in science, art,
engineering, and aesthetics,
were based only on his
perception of the world
B. Shneiderman, Leonardo’s Laptop: Human Needs and the New Computing Technologies,
MIT Press (2003)
The Web: A Young Boy
Less than 30 years have
elapsed since the invention
of the Web. We are still just
playing with it as we are yet
to discover how to fully
make use of it
The Web as a Lab
The Web today not only
represents an unlimited data
store but also a multi-
disciplinary laboratory
environment for world scale
experiments
B. White, The Web as a Laboratory, Invited Talk at WWW MABSDA (2013)
Eras of the Web
The Web is evolving towards
a shared social experience, in
which consumers will rely on
their peers as they make
online decisions and will
shape future products
J. Owyang et al., The Future of the Social Web, Forrester Research (2009)
Big Social Data Analysis
Between the dawn of the
Internet and 2003, there were 5
exabytes of information on the
Web. Now, we create 5
exabytes every 2 days
E. Cambria, H. Wang, B. White. Big Social Data Analysis, Knowledge-Based Systems (2009)
The Web 3.0 Dream
People’s Fault?
Online contents are mostly
meant for human
consumption. Why should
web developers and bloggers
care about making their
content machine
processable?
Technology’s Fault?
WHAT DO WE NEED?
Evolution of NLP
Natural Language Processing
technologies evolved from the
era of punch cards (7 mins per
sentence) to the era of Google
and its like (less than a second
per sentence)
NLP Emergency
In a Web where User
Generated Content has hit
critical mass, NLP is becoming
key for aggregating information
although systems are still
limited by what they can see
More Than We See
Language is somewhere in
between perception and
understanding
D. Davidson, Seeing Through Language, Royal Institute of Philosophy Supplement, 42, (1997)
Understanding Language
Natural Language
understanding requires high-
level symbolic capabilities
that most NLP technologies
currently do not possess
M. Dyer, Connectionist natural language processing: A status report, in Computational
Architectures Integrating Neural and Symbolic Processes, vol 292, (1995)
Creation and propagation of dynamic bindings
Manipulation of recursive, constituent structures
Acquisition and access to lexical, semantic and
episodic memories
Control of multiple learning/processing modules
and routing of information among such modules
Grounding of basic-level language constructs
(objects and actions) in perceptual / motor
experience
Representation of abstract concepts
The Hardest Problem?
We can understand almost anything,
but we can’t understand how we understand.
Albert Einstein
We can understand human mental processes
only slightly better than a fish understands swimming.
John McCarthy
How the mind works is still a mystery.
We understand the hardware,
but we don’t have a clue about the operating system.
James Watson
AI Meets Natural Stupidity
The key failure of AI is the
persistence in seeking the best
way to solve a problem, which
leads to the creation of expert
(rather than intelligent) systems
D. McDermott, Artificial Intelligence Meets Natural Stupidity, ACM SIGART Bulletin, 57, (1976)
Machine Learning is Great
Recent machine learning
algorithms (Condensed Nearest
Neighbor, Extreme Learning
Machine, Multiple Kernel Learning
and deep learning) can get great
classification accuracy on any
given labeled dataset
Learning What Again?
In most cases, we are simply
teaching machines word co-
occurrence frequencies.
It’s like teaching someone
snowboarding by only showing
them videos.
Jumping NLP Curves
E. Cambria, B. White, Jumping NLP Curves: A review of Natural Language Processing Research,
IEEE Computational Intelligence Magazine, 9, (2014)
A Possible Path to NLU
Natural Language
Processing
Natural
Language
UNDERSTANDING
Sentiment Analysis
Suitcase Research Field
This past Saturday, I bought a Nokia phone and my girlfriend
bought a Motorola phone.
We called each other when we got home.
The voice on my phone was not so clear, worse than my
previous phone. The camera was good. My girlfriend was quite
happy with her phone.
I wanted a phone with good voice quality.
So my purchase was a real disappointment.
I returned the phone yesterday.
Deeply Unstructured
According to different
evaluation schemes and
reviewers, a very positive and a
very negative review might
both have the same star rating
M. Hu and B. Liu, Mining and Summarizing Customer Reviews, in ACM SIGKDD
Conference on Knowledge Discovery and Data Mining, Seattle, (2004)
Sentiment Analysis
Sentiment analysis research
evolved from heuristics to
discourse structure, from
coarse to fine-grained analysis,
from keyword to concept-level
mining
E. Cambria, Affective computing and sentiment analysis, IEEE Intelligent Systems 31 (2016)
Keyword Spotting
Although the most naïve
approach, the accessibility and
economy of keyword spotting
make it one of the most popular
A. Ortony, G. Clore and A. Collins, The Cognitive Structure of Emotions, Cambridge Univ.
Press (1988)
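Keyword spotting can be sketched in a few lines. The example below is only an illustration: the `AFFECT_KEYWORDS` table and the `spot_keywords` helper are hypothetical, and a real system would load a curated affect lexicon instead.

```python
# Hypothetical affect lexicon; entries are illustrative, not from a real resource.
AFFECT_KEYWORDS = {
    "good": "positive",
    "happy": "positive",
    "clear": "positive",
    "bad": "negative",
    "worse": "negative",
    "disappointment": "negative",
}


def spot_keywords(text):
    """Return (word, label) pairs for every lexicon word found in the text."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    return [(t, AFFECT_KEYWORDS[t]) for t in tokens if t in AFFECT_KEYWORDS]


hits = spot_keywords(
    "The voice on my phone was not so clear, worse than my previous phone."
)
print(hits)
```

Note that "clear" is reported as positive even though the sentence says "not so clear": the lookup ignores negation, which is exactly the weakness of this naïve approach.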
Lexical Affinity
Lexical Affinity assigns arbitrary
words probable affinity to
particular emotions (accident
has a 75% probability of
indicating a negative affect)
R. Stevenson et al., Characterization of the Affective Norms for English Words by Discrete
Emotional Categories, Behavior Research Methods, 39(4) (2007)
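A minimal sketch of lexical affinity, assuming a small hand-made probability table. Only the 0.75 value for "accident" comes from the slide; the other entries, the 0.5 default and the `negative_score` helper are assumptions made for illustration.

```python
# Probability that a word indicates negative affect.
# "accident" is taken from the slide; the other values are invented.
NEGATIVE_AFFINITY = {"accident": 0.75, "delay": 0.5, "gift": 0.125}


def negative_score(text, default=0.5):
    """Average the negative-affinity probabilities of the known words."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    probs = [NEGATIVE_AFFINITY[t] for t in tokens if t in NEGATIVE_AFFINITY]
    return sum(probs) / len(probs) if probs else default


print(negative_score("I avoided an accident"))
print(negative_score("I met my girlfriend by accident"))
```

Both sentences receive the same score because the probability is attached to the word rather than to its context, so word-level affinity is easily misled.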
Statistical Methods
By feeding a ML algorithm a
large training corpus,
statistical methods not only
learn the valence of affect
words, but also that of other
arbitrary keywords
R. Socher, A. Perelygin, J. Wu, J. Chuang, C. Manning, A. Ng, C. Potts, Recursive Deep
Models for Semantic Compositionality Over a Sentiment Treebank, EMNLP, (2013)
Concept-Level Analysis
By relying on ontologies or
semantic networks, concept-
level approaches step away
from blindly using affect
keywords and word co-
occurrence frequencies
E. Cambria, An Introduction to Concept-Level Sentiment Analysis, LNAI, 8266 (2013)
Conceptualization
Concepts are immaterial
entities that only exist in the
mind of the speaker. To be
communicated, they must be
represented in terms of some
concrete artifact
S. Ullmann, Semantics: An Introduction to the Science of Meaning, Barnes & Noble, (1979)
Ceci n’est pas une pipe
You can know the name of all
the different kinds of pipe, but
you know nothing about a pipe
until you comprehend its
purpose and method of usage
R. Magritte, Les Mots et les Images, La Révolution surréaliste 12, (1929)
From BoW to BoC
smile sad_smile
damn damn_good
pretty pretty_ugly
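The difference between a bag of words and a bag of concepts can be sketched as follows. This is an assumption-laden illustration: the `CONCEPTS` vocabulary only contains the three multiword examples listed above, and the greedy longest-match strategy is a stand-in, not the parser used by the authors.

```python
# Hypothetical concept vocabulary built from the examples on the slide.
CONCEPTS = {"sad_smile", "damn_good", "pretty_ugly"}


def bag_of_words(text):
    """Split the text into lowercase single-word tokens."""
    return text.lower().split()


def bag_of_concepts(text, concepts=CONCEPTS, max_len=2):
    """Greedy left-to-right matching of the longest known multiword concept."""
    words = text.lower().split()
    out, i = [], 0
    while i < len(words):
        for n in range(max_len, 0, -1):
            if i + n > len(words):
                continue  # not enough words left for an n-word concept
            candidate = "_".join(words[i:i + n])
            # Single words are always kept; longer spans must be known concepts.
            if n == 1 or candidate in concepts:
                out.append(candidate)
                i += n
                break
    return out


print(bag_of_words("a pretty ugly phone"))
print(bag_of_concepts("a pretty ugly phone"))
```

In the bag of words, "pretty" looks like a positive keyword; in the bag of concepts, it is absorbed into `pretty_ugly`, which carries the opposite meaning.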
THE CLSA MODEL
Concept-Level Sentiment Analysis Model
E. Cambria, S. Poria, F. Bisio, R. Bajpai and I. Chaturvedi. The CLSA Model: A Novel Framework for Concept-Level
Sentiment Analysis. In CICLing, LNCS 9042, (2015)
Microtext Analysis
Before NLP techniques can be
applied, informal text, jargon,
acronyms and emoticons must
be converted into plain text
K. Bontcheva, L. Derczynski, A. Funk, M.A. Greenwood, D. Maynard and N. Aswani. TwitIE:
An Open-Source Information Extraction Pipeline for Microblog Text, in ACL, (2013)
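A minimal sketch of microtext normalization, assuming a tiny hand-written lookup table. The `MICROTEXT` entries and the `normalize` helper are hypothetical; real pipelines rely on much larger dictionaries and on context-aware rules.

```python
# Hypothetical normalization table for acronyms, shorthand and emoticons.
MICROTEXT = {
    "gr8": "great",
    "thx": "thanks",
    "u": "you",
    ":)": "happy",
    ":(": "sad",
}


def normalize(text):
    """Replace known microtext tokens with plain-text equivalents."""
    return " ".join(MICROTEXT.get(tok.lower(), tok) for tok in text.split())


print(normalize("thx the phone is gr8 :)"))
```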
Semantic Parsing
The camera has [long focus time]
The camera takes a [long time] to [focus]
The [focusing] of the camera takes [long time]
The [focus time] of the camera is very [long]
long_focus_time
K. Frantzi, S. Ananiadou, H. Mima. Automatic recognition of multi-word terms: the C-
value/NC-value method. International Journal on Digital Libraries 2, (2000)
Subjectivity Detection
Subjectivity Detection is a very
important subtask that consists
in classifying a given text into
one of two classes: objective
(neutral) or subjective (positive/
negative)
B. Pang, L. Lee. Subjectivity Detection and Opinion Identification, in Opinion Mining and
Sentiment Analysis, Now Publishers Inc (2008)
Anaphora Resolution
Anaphora is the use of an
expression the interpretation of
which depends upon another
one. It is commonly resolved by
gender and number agreement
R. Mitkov. Anaphora Resolution: the State of the Art. University of Wolverhampton, (1999)
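Resolution by gender and number agreement can be sketched as below. The feature tables and the `resolve` helper are hypothetical, and picking the most recent agreeing candidate is a simplifying assumption rather than a description of any particular system.

```python
# Hypothetical pronoun features: (gender, number). None means "any gender".
PRONOUNS = {
    "he": ("masc", "sg"),
    "she": ("fem", "sg"),
    "it": ("neut", "sg"),
    "they": (None, "pl"),
}


def resolve(pronoun, candidates):
    """Return the most recent candidate agreeing in gender and number.

    `candidates` lists (noun, gender, number) tuples in order of appearance.
    """
    gender, number = PRONOUNS[pronoun.lower()]
    for noun, cand_gender, cand_number in reversed(candidates):
        if cand_number == number and (gender is None or cand_gender == gender):
            return noun
    return None


mentions = [("girlfriend", "fem", "sg"), ("phone", "neut", "sg")]
print(resolve("she", mentions))
print(resolve("it", mentions))
```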
Topic Spotting
Topic spotting consists in
assigning category tags to a
piece of text. It includes several
subtasks and can be exploited
for context-level analysis
Y. Ma, E. Cambria and S. Gao. Label Embedding for Zero-shot Fine-grained Named Entity
Typing. In: COLING, Osaka (2016)
Aspect Extraction
S. Poria, E. Cambria and A. Gelbukh. Aspect extraction for opinion mining with a deep
convolutional neural network. Knowledge-Based Systems 108. (2016)
Polarity Detection
Early works treated polarity
detection as a binary
classification problem (pos vs
neg). Recent works calculate
polarity intensity as a float in
the range of [-1, +1]
B. Pang, L. Lee. Opinion mining and sentiment analysis. Foundations and Trends in
Information Retrieval 2. (2008)
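The two formulations can be contrasted with a short sketch. The `POLARITY` values are invented for illustration, and averaging word scores is an assumed stand-in for whatever model produces the intensity.

```python
# Hypothetical polarity intensities in [-1, +1].
POLARITY = {"good": 0.6, "clear": 0.4, "worse": -0.5, "disappointment": -0.8}


def polarity_intensity(text):
    """Mean polarity of the known words, clamped to [-1, +1]; 0.0 if none."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    scores = [POLARITY[t] for t in tokens if t in POLARITY]
    if not scores:
        return 0.0
    mean = sum(scores) / len(scores)
    return max(-1.0, min(1.0, mean))


def binary_polarity(text):
    """Collapse the intensity into the early pos/neg formulation."""
    return "pos" if polarity_intensity(text) >= 0 else "neg"


print(polarity_intensity("So my purchase was a real disappointment."))
print(binary_polarity("So my purchase was a real disappointment."))
```

The binary label discards how strongly negative the sentence is, which is the information the float-valued formulation preserves.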
Sarcasm Detection
Sarcasm transforms the
polarity of an apparently
positive utterance into its
opposite. It is often
characterized by contextual
imbalance and high intensity
S. Poria, E. Cambria, D. Hazarika, and P. Vij. A Deeper Look into Sarcastic Tweets Using
Deep Convolutional Neural Networks. In: COLING, Osaka (2016)
SENTIC COMPUTING
E. Cambria, A. Hussain. Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment
Analysis. Cham: Springer (2015)
3 Shifts
From Mono- to Multi-
Disciplinarity
From Syntax to Semantics
From Statistics to
Linguistics
From Mono- to
Multi-Disciplinarity
cloud_computing cloud
pain_killer pain, killer
From Syntax to Semantics
The car is nice but expensive
The car is expensive but nice
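The two sentences above contain exactly the same words, so a bag-of-words score cannot tell them apart. The sketch below assumes hypothetical polarity values and a deliberately simplified adversative rule (the clause after "but" decides the polarity); the rule is an illustration and is not taken from the slides.

```python
# Hypothetical word polarities.
POLARITY = {"nice": 0.75, "expensive": -0.5}


def bow_score(sentence):
    """Order-insensitive score: sum of the polarities of all words."""
    return sum(POLARITY.get(w, 0.0) for w in sentence.lower().split())


def but_rule_score(sentence):
    """Score only the clause after 'but', if the sentence contains one."""
    words = sentence.lower().split()
    if "but" in words:
        words = words[words.index("but") + 1:]
    return sum(POLARITY.get(w, 0.0) for w in words)


first = "The car is nice but expensive"
second = "The car is expensive but nice"
print(bow_score(first), bow_score(second))
print(but_rule_score(first), but_rule_score(second))
```

The bag-of-words score is identical for both sentences, while the order-aware rule rates the first as negative and the second as positive.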
From Statistics to Linguistics
Building SenticNet
Knowledge Acquisition
WordNet Affect
WordNet Affect is an
extension of WordNet
Domains, including a subset of
synsets suitable to represent
concepts correlated with
affective words
C. Strapparava and A. Valitutti, WordNet-Affect: An affective extension of WordNet. In:
LREC, Lisbon (2004)
Open Mind Common Sense
OMCS is an AI project whose
goal is to build a commonsense
knowledge base from the
contribution of many thousands
of people across the web
R. Speer, Open Mind Commons: An inquisitive approach to learning common sense. In:
Workshop on Common Sense and Interactive Application, Honolulu (2007)
Common Knowledge
In standard human-to-human
communication, people usually
rely on the presumption that
facts or definitions are known
and proceed to build upon them
E. Cambria, Y.Q. Song, H. Wang and N. Howard. Semantic multi-dimensional scaling for
open domain sentiment analysis. In: IEEE Intelligent Systems 29, (2014)
Commonsense
People usually provide only
useful information and take the
rest for granted. The rest is
commonsense: obvious things
people know and usually leave
unstated
E. Cambria, A. Hussain, C. Havasi and C. Eckl. Common Sense Computing: From the
Society of Mind to Digital Intuition and Beyond. In: LNCS 5507, (2009)
GECKA
The Game Engine for
Commonsense Knowledge
Acquisition aims to collect
knowledge from game
designers through the
development of games
E. Cambria, D. Olsher, D. Rajagopal and K. Kwak. GECKA: Game Engine for
Commonsense Knowledge Acquisition. In: FLAIRS, Hollywood (2015)
Encoding Knowledge
Game designers drag and drop
objects from libraries into
scenes and define their
possible interactions by means
of prerequisite-outcome-goal
(POG) triples
E. Cambria, D. Olsher, D. Rajagopal and K. Kwak. GECKA: Game Engine for
Commonsense Knowledge Acquisition. In: FLAIRS, Hollywood (2015)
Data Collection
POG data is encoded and
collected in XML format.
Interaction semantics between
objects and characters are
specified for each scene,
together with affect information
E. Cambria, D. Olsher, D. Rajagopal and K. Kwak. GECKA: Game Engine for
Commonsense Knowledge Acquisition. In: FLAIRS, Hollywood (2015)
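The slides do not show the actual XML schema, so the following is only a sketch of what POG data with affect information might look like and how it could be read. Every element and attribute name (`scene`, `interaction`, `prerequisite`, `outcome`, `goal`, `affect`) is an assumption, as is the `parse_pog` helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real GECKA format may differ.
POG_XML = """
<scene name="kitchen">
  <interaction subject="character" object="knife">
    <prerequisite>hold knife</prerequisite>
    <outcome>bread is sliced</outcome>
    <goal>make sandwich</goal>
    <affect character="cook">joy</affect>
  </interaction>
</scene>
"""


def parse_pog(xml_text):
    """Extract one prerequisite-outcome-goal record per interaction."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for inter in root.iter("interaction"):
        records.append({
            "scene": root.get("name"),
            "object": inter.get("object"),
            "prerequisite": inter.findtext("prerequisite"),
            "outcome": inter.findtext("outcome"),
            "goal": inter.findtext("goal"),
            "affect": inter.findtext("affect"),
        })
    return records


triples = parse_pog(POG_XML)
print(triples[0]["prerequisite"], "->", triples[0]["outcome"], "->", triples[0]["goal"])
```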
Affective Information
POG specifications not only
allow game designers to define
interaction semantics between
objects, but also affective
reactions of different characters
E. Cambria, D. Olsher, D. Rajagopal and K. Kwak. GECKA: Game Engine for
Commonsense Knowledge Acquisition. In: FLAIRS, Hollywood (2015)
Knowledge Representation
AffectNet Graph
E. Cambria, J. Fu, F. Bisio and S. Poria. AffectiveSpace 2: Enabling affective intuition for
concept level sentiment analysis. In: AAAI, Austin (2015)
AffectNet Matrix
E. Cambria, J. Fu, F. Bisio and S. Poria. AffectiveSpace 2: Enabling affective intuition for
concept level sentiment analysis. In: AAAI, Austin (2015)
Affective Space
E. Cambria, J. Fu, F. Bisio and S. Poria. AffectiveSpace 2: Enabling affective intuition for
concept level sentiment analysis. In: AAAI, Austin (2015)
Knowledge-Based Reasoning
Sentic Activation
E. Cambria, D. Olsher and K. Kwak. Sentic Activation: A two-level Affective Common Sense
Reasoning Framework. In: AAAI, Toronto (2012)
Sentic Panalogy
E. Cambria, D. Olsher and K. Kwak. Sentic Panalogy: Swapping Affective Common Sense
Reasoning Strategies. In: CogSci, Sapporo (2012)
Several analogous
representations of the same
problem should be kept in
parallel so that the system can
switch tracks when problem-
solving stalls
Hourglass Model
E. Cambria, A. Livingstone and A. Hussain. The Hourglass of Emotions. In: Cognitive
Behavioral Systems, LNCS 7403, Springer (2012)
The mind is made up of
different independent
resources. Turning some sets
of resources on while turning
others off results in different
emotional states
Feeling and Thinking
M. Minsky, The Emotion Machine: Commonsense Thinking, Artificial Intelligence, and the
Future of the Human Mind. Simon & Schuster, New York (2006)
The question is not whether
intelligent machines can have
emotions, but whether machines
can be intelligent without any
emotions
To Feel or not to Feel?
R. Plutchik. The Nature of Emotions. American Scientist 89 (2001)
Hourglass Model
Sentic Neurons
L. Oneto, F. Bisio, E. Cambria and D. Anguita. Statistical learning theory and ELM for big
social data analysis, IEEE Computational Intelligence Magazine 11, Springer (2016)
The integration of a bio-
inspired paradigm with principal
component analysis allows for
better comprehension of non-
linearities in AffectiveSpace
Internet Marketing Concept Map
http://www.conceptdraw.com/examples/concept-map-of-marketing
Research Topics
draft
Data Mining
The exploration of high-rate information streams requires specific techniques
for addressing heterogeneous sources, especially in connection to Social
Networks and blogs.
In this context the normalization of unstructured data plays a crucial role for
lexical/syntax analysis and subsequent semantic processing.
The research aims to:
create suitable connectors to data streams (blogs, social networks, forums),
since each source has a peculiar format and requires specific adaptation of
data gathering
analyze the gathered data for information extraction (lexicon, possibly
syntax, semantics)
build a graph-like representation of facts and sources to get a
comprehensive representation of the observed phenomenon
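A graph-like representation of facts and sources could be as simple as an adjacency map. The sketch below is an assumption: the source names, the fact labels and the `build_graph` helper are invented for illustration.

```python
from collections import defaultdict


def build_graph(observations):
    """Build an undirected graph in which sources and facts are both nodes."""
    graph = defaultdict(set)
    for source, fact in observations:
        graph[source].add(fact)
        graph[fact].add(source)
    return graph


# Hypothetical (source, fact) pairs produced by the extraction step.
observations = [
    ("blog_a", "nokia:voice_quality"),
    ("forum_b", "nokia:voice_quality"),
    ("forum_b", "motorola:camera"),
]
graph = build_graph(observations)

# Which sources discuss the same fact?
print(sorted(graph["nokia:voice_quality"]))
```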
Algorithms
The handling of massive data volumes requires specific algorithms to ensure
effectiveness and especially efficiency in computation.
The main guidelines include:
efficient algorithms for distributed clustering of text documents
adaptive tools for effective classification, so that the system is capable of
adapting in compliance with the user's experience and expectations
graph-exploration algorithms for interpreting the observed scenario in terms
of semantics and connections among sources
Sentic & Semantics
The analysis of texts and sources includes semantic analysis with several
outcomes
the ability to characterize a target aspect (product, brand, idea) with a
representation of the perceived feeling by a target community
the ability to identify possible areas of market development by detecting
areas of possible penetration to cover market deficiencies or prospects
the capability to associate a general Sentic representation with a community
of sources in order to monitor expectations and reactions
User Experience
The main idea is to endow the system with the capability to adjust its own
behaviour in compliance with the user's preferences.
At run time, the system observes and tracks the user's choices and directives,
and applies the acquired information to its own subsequent selections that are
prompted to the user in future interactions.
So the system creates a specific profile for each specific user. Classifier tools
and semantic techniques are used to build such an integrated profile to
optimize the user's experience.
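The observe-then-adapt loop described above can be sketched with a per-user frequency count. This is a deliberately simple stand-in for the classifier tools and semantic techniques mentioned in the slide; the `UserProfiles` class and its method names are hypothetical.

```python
from collections import Counter, defaultdict


class UserProfiles:
    """Track each user's choices and rank future options accordingly."""

    def __init__(self):
        self._choices = defaultdict(Counter)

    def observe(self, user, choice):
        """Record one choice made by the user at run time."""
        self._choices[user][choice] += 1

    def rank(self, user, options):
        """Order options by how often the user picked them (ties keep order)."""
        counts = self._choices[user]
        return sorted(options, key=lambda option: -counts[option])


profiles = UserProfiles()
profiles.observe("ann", "cameras")
profiles.observe("ann", "cameras")
profiles.observe("ann", "phones")

print(profiles.rank("ann", ["tablets", "phones", "cameras"]))
print(profiles.rank("bob", ["tablets", "phones", "cameras"]))
```

A user with no history keeps the default ordering, while a user with recorded choices sees the most frequently selected categories first.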
Thanks to www.sentic.net
Thank You!