
Using emojis as universal sentence representation for social media data

Alexis Dutot

22/05/2019

Paris NLP Season 3, Meetup #5


● Introduction

● DeepMoji

● Internal challenges

● Our approach: Unimoji

● Conclusion & perspectives

Introduction

Linkfluence

- Social Media Intelligence company

- Activities: software & market research

- 2 products:

- Radarly

- Search

- 250+ employees across 6 offices

Our day-to-day work

Research:
- Read papers
- Technology watch
- Prototype new features
- Train models
- Science popularization

Production:
- Implement new features to fit into the production pipeline (near real-time inference)
- Build batch computations for AI features not computed in real time
- Enhance the processing pipeline


Production environment

- Research playground

- Machine learning & NLP toolkits

- Programming languages


Our pipeline

Stages: language detection → NER extraction → categorization → opinion mining → location & user inference

Stats:
● ~1200 documents per second
● > 60 languages
● > 10 platforms (social media & web)
● > 65 models in the pipeline


Opinion mining

- Sentiment analysis: document-level sentiment analysis with 4 classes: positive, negative, neutral and mixed
- Emotion detection: document-level multi-emotion detection with 7 classes: anger, disgust, fear, joy, love, sadness and surprise

● Initial goal: enhance the sentiment analysis algorithm that was in the production pipeline
● Challenges:
○ Social media posts are noisy user-generated content: spelling mistakes, grammatical errors, contractions, abbreviations, specific terms, ...
○ Very few annotated corpora are available, with few examples per corpus
○ The majority of these corpora are in English and domain-specific

Sentiment analysis for social media data is limited by the scarcity of manually annotated data

Use distant supervision methods to make models learn useful text representations (such as emotional content) before modeling these tasks directly:

● Use specific hashtags (#good, #bad, #angry, #fml) to automatically label high volumes of data (Mohammad, 2012)
● Use predefined sets of positive and negative emoticons or emojis for automatic data labelling (Deriu et al., 2016; Tang et al., 2014) → our previous sentiment analysis model
● Pre-train a model to predict emojis given a document to learn a rich emotional text representation, then fine-tune it on a specific opinion mining task: DeepMoji (Felbo et al., 2017)

How can we leverage this “lack” of manually annotated data?
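The emoji-set labelling idea from the second bullet can be sketched as follows; the two polarity sets here are illustrative examples, not the sets actually used in production or in the cited papers.

```python
# Minimal sketch of distant supervision with predefined emoji polarity sets
# (in the spirit of Deriu et al., 2016 / Tang et al., 2014). The two emoji
# sets below are illustrative, not the production ones.
POSITIVE_EMOJIS = {"😁", "😂", "👍", "❤"}
NEGATIVE_EMOJIS = {"😔", "😢", "😡"}

def noisy_label(post):
    """Assign a weak sentiment label from the emojis a post contains."""
    pos = sum(ch in POSITIVE_EMOJIS for ch in post)
    neg = sum(ch in NEGATIVE_EMOJIS for ch in post)
    if pos and not neg:
        return "POSITIVE"
    if neg and not pos:
        return "NEGATIVE"
    return None  # emoji-free or mixed posts are simply discarded

print(noisy_label("This was soooo FUN !!! 😁😁"))  # POSITIVE
print(noisy_label("no emojis here"))               # None
```

The appeal is that no human annotation is needed: the emojis the author typed act as the (noisy) label, so very large training sets can be built cheaply.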

DeepMoji

DeepMoji: leverage the power of emoji to accurately encode the emotional content of texts.

The power of emoji

Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm (Felbo et al., 2017)

https://deepmoji.mit.edu/

This was soooo FUN !!! 😁😁 → [this, was, soo, fun, !!] → POSITIVE

→ Build a training set of 1.2B tweets with emojis as noisy labels

This was soooo FUN !!! 😁😁 → [this, was, soo, fun, !!] → 😁

→ Pre-train a model to predict an emoji probability distribution given a text
→ Fine-tune this model on a specific opinion mining task (sentiment analysis, emotion detection & sarcasm detection)

The model

2-layer BiLSTM with attention

[Diagram: pre-training and transfer-learning phases]

Fine-tuning is done using the chain-thaw approach: sequentially fine-tune one layer at a time
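The chain-thaw schedule can be sketched as below; `Layer` and `train_one_pass` are stand-ins for real training code, not the DeepMoji codebase, and the layer names are illustrative.

```python
# Sketch of the chain-thaw schedule (Felbo et al., 2017): fine-tune the new
# task layer first, then each pre-trained layer individually, then the whole
# network. `Layer` and `train_one_pass` are stand-ins for real training code.
class Layer:
    def __init__(self, name):
        self.name, self.trainable = name, False

def chain_thaw(layers, train_one_pass):
    schedule = []
    def thaw_only(target):
        for l in layers:
            l.trainable = l is target   # freeze everything except `target`
        train_one_pass(layers)
        schedule.append(target.name)
    thaw_only(layers[-1])               # 1) the freshly added task layer
    for layer in layers[:-1]:           # 2) each pre-trained layer in turn
        thaw_only(layer)
    for l in layers:                    # 3) finally, the whole model jointly
        l.trainable = True
    train_one_pass(layers)
    schedule.append("all")
    return schedule

model = [Layer(n) for n in ("embedding", "bilstm_1", "bilstm_2", "attention", "softmax")]
print(chain_thaw(model, train_one_pass=lambda layers: None))
# ['softmax', 'embedding', 'bilstm_1', 'bilstm_2', 'attention', 'all']
```

Thawing one layer at a time lets the task head adapt first without destroying the pre-trained emotional representation in the lower layers.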

Advantages of DeepMoji

● SoTA on 3 opinion mining tasks (before BERT’s arrival)

● Really good fit for our use-case: opinion mining on social media posts

● Simple and easy-to-read code written in Keras to perform tests and reproduce results

Internal Challenges

Challenges reminder

1. Perform opinion mining in many (>60) languages on every social media platform: DeepMoji requires manually annotated data for each target task and for each language → the multilingual problem

2. Handle at least 1200 documents per second without making the hardware costs skyrocket: we assume that a Bi-LSTM would not be an option → the computational problem

Limitations & resources

Research environment:
- Hardware: 4 GTX 1080 Ti
- Frameworks: Keras + TensorFlow

Production environment:
- TensorFlow offers an “almost” stable Java API (ONNX and DeepLearning4J are not mature yet)
- CPU-only production instances
- The current processing pipeline on Apache Storm (JVM) does not handle batching

→ Not ideal for deep learning models

Our idea

[Diagram: three language-specific Doc2Emoji models, trained on English, French and Spanish respectively, map documents such as “Deep Learning is awesome !”, “J’adore mon nouvel iPhone” (“I love my new iPhone”) and “Detesto el fin de la Casa de Papel…” (“I hate the ending of La Casa de Papel…”) to emoji probability distributions (e.g. 👍 0.35, 😔 0.002, ❤ 0.68, 😡 0.36); a single Emoji2Sentiment model, trained on English annotated corpora, maps each distribution to POSITIVE or NEGATIVE.]

Emojis are universal across languages and are used more and more on social media platforms
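The decoupling above can be illustrated with a toy pipeline: a per-language Doc2Emoji front-end produces an emoji distribution, and one shared Emoji2Sentiment head consumes it. Both functions below are keyword/lookup stand-ins invented for illustration, not the real trained networks, and the tiny lexicon is hypothetical.

```python
# Toy illustration of the Unimoji decoupling: per-language Doc2Emoji models
# feed one shared, language-agnostic Emoji2Sentiment head.
EMOJIS = ["👍", "❤", "😂", "😔", "😢", "😡"]
POLARITY = {"👍": 1, "❤": 1, "😂": 1, "😔": -1, "😢": -1, "😡": -1}

def toy_doc2emoji(text, lexicon):
    """Stand-in for a per-language Doc2Emoji model: score emojis from a tiny
    keyword lexicon and normalise the scores into a distribution."""
    scores = [sum(word in text.lower() for word in lexicon.get(e, [])) + 1e-6
              for e in EMOJIS]
    total = sum(scores)
    return {e: s / total for e, s in zip(EMOJIS, scores)}

def emoji2sentiment(dist):
    """Shared head: language-agnostic because it only ever sees the emoji
    distribution, never the original text."""
    score = sum(p * POLARITY[e] for e, p in dist.items())
    return "POSITIVE" if score >= 0 else "NEGATIVE"

# hypothetical miniature French lexicon, for illustration only
FR_LEXICON = {"❤": ["adore", "aime"], "👍": ["adore"], "😡": ["deteste"]}
print(emoji2sentiment(toy_doc2emoji("J'adore mon nouvel iPhone", FR_LEXICON)))
# POSITIVE
```

The design point is that only the front-end needs retraining per language; the head, trained once on English annotated corpora, is reused everywhere.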

Proof of Concept

● Validating the approach: use pre-trained DeepMoji (predicts an emoji probability distribution) + MLP (predicts sentiment from the distribution)
Small loss of accuracy compared to fine-tuned methods (2-5 points) → acceptable
● Reproduce DeepMoji pre-training on our own English data
● Issues:
1. 1 epoch: 12 days (too long)
2. Inference time in production: 50 ms/input (too slow)
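A back-of-the-envelope calculation shows why 50 ms/input is too slow at the pipeline's ~1200 documents per second. The one-document-in-flight-per-worker model below is a deliberate simplification (no batching, no network overhead); the latency and throughput figures are the ones from the slides.

```python
import math

# How many sequential workers are needed to sustain a given document rate,
# if each worker handles one document at a time?
def workers_needed(docs_per_second, latency_ms):
    per_worker_throughput = 1000.0 / latency_ms   # docs/s one worker can do
    return math.ceil(docs_per_second / per_worker_throughput)

print(workers_needed(1200, 50))  # 60 workers at the BiLSTM's 50 ms/input
print(workers_needed(1200, 5))   # 6 workers at the 5 ms/input reported later
```

Required hardware scales linearly with per-document latency, which is why a 10x cheaper architecture translates directly into 10x fewer CPU instances.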

Tackling the computational problem

At this point:
● Impossible to use an RNN architecture in production
● Need an alternative...

1. Can we replace the DeepMoji architecture with a computationally cheaper one while preserving a good emotional context representation?
2. Can this emotional context representation using emojis be used to perform multilingual opinion mining tasks?

Our approach: Unimoji

Doc2Emoji

Different CNN architectures were tried. The final architecture is a combination of:
● Attentive convolutions (Yin, 2017)
● The 2-layer CNN architecture used by the SwissCheese team, winners of Task 1-A of SemEval 2016 (Deriu et al., 2016)

[Diagram: the light attentive convolution layer and the Doc2Emoji architecture we used, instantiated for EN, FR and ES]
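This is not the full attentive convolution of Yin (2017), which attends between two texts; the following is only a minimal numpy illustration of the general idea of attention-pooled convolution features, with made-up shapes and random weights.

```python
import numpy as np

# Toy attention-pooled 1D convolution: convolve token embeddings, then pool
# the feature maps with softmax attention weights instead of max-pooling.
# All shapes and parameters here are illustrative, not the production ones.
rng = np.random.default_rng(0)
T, d, k = 12, 16, 8                  # sequence length, embed dim, filters

x = rng.normal(size=(T, d))          # embedded tokens
W = rng.normal(size=(3 * d, k))      # width-3 convolution filters

# 1D convolution expressed as a matrix product over sliding windows
windows = np.stack([x[t:t + 3].ravel() for t in range(T - 2)])
h = np.maximum(windows @ W, 0.0)     # ReLU feature maps, shape (T-2, k)

# attention pooling: score each position, softmax, weighted sum
v = rng.normal(size=(k,))            # learned attention vector (random here)
scores = h @ v
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                 # attention weights, sum to 1
doc_vec = alpha @ h                  # document representation, shape (k,)
print(doc_vec.shape)                 # (8,)
```

Unlike a recurrent layer, every window here is computed independently, which is what makes a CNN of this kind so much cheaper on CPU.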

Statistics:
● Dataset: 512M tweets
● Training: 44 h/epoch (vs 12 days/epoch)
● Prediction in production: 5 ms/input (vs 50 ms/input)

Our architecture performed almost as well as DeepMoji. Is this representation accurate enough to solve opinion mining tasks?

[Table: top-1 and top-5 emoji prediction accuracies for the EN, FR and ES Doc2Emoji models]

Emoji2Task

Architecture: 2-layer neural network

Comparing the quality of the learnt sentence representations: benchmarking against the DeepMoji approach

1. Can we replace the DeepMoji architecture with a computationally cheaper one while preserving a good emotional context representation?

[Benchmark results for EN, FR and ES]
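A 2-layer head of this kind could look like the sketch below. The dimensions (64 emojis, 32 hidden units, 3 sentiment classes) and the random weights are assumptions for illustration; the deck does not give the exact sizes.

```python
import numpy as np

# Hypothetical shape of the Emoji2Task head: a 2-layer network mapping an
# emoji probability distribution to class probabilities. All sizes and
# weights are illustrative stand-ins, not the trained model.
rng = np.random.default_rng(0)
n_emoji, hidden, n_classes = 64, 32, 3

W1, b1 = rng.normal(size=(n_emoji, hidden)), np.zeros(hidden)
W2, b2 = rng.normal(size=(hidden, n_classes)), np.zeros(n_classes)

def emoji2task(dist):
    h = np.maximum(dist @ W1 + b1, 0.0)              # ReLU hidden layer
    logits = h @ W2 + b2
    p = np.exp(logits - logits.max())
    return p / p.sum()                               # softmax over classes

dist = rng.dirichlet(np.ones(n_emoji))               # a fake emoji distribution
probs = emoji2task(dist)
print(probs.shape)                                   # (3,)
```

Because the head only sees a 64-dimensional distribution rather than raw text, it is tiny and essentially free at inference time compared to Doc2Emoji.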

2. Can this emotional context representation using emojis be used to perform multilingual opinion mining tasks?

Train 3 new Doc2Emoji models: French, German, Simplified Chinese

Experiments: sentiment analysis & emotion detection

Multilingual sentiment analysis

Training: SemEval 2016 Task 4-A dataset (3 classes: negative, positive, neutral)

Evaluation: internally annotated data in English, German, French & Chinese

Results (vs previous algorithm):
● English accuracy improvement: ~ +10% (90% acc)
● French accuracy improvement: ~ +7% (87% acc)
● German accuracy improvement: ~ +6% (81% acc)
● Chinese accuracy improvement: ~ -30% (40% acc)

The multilingual approach improved the results for all languages except Chinese
→ Emoji context in Chinese ≠ emoji context in English

Multilingual emotion detection

[Diagram: emoji clusters for love & sadness, anger & disgust, and surprise]

Training: SemEval 2018 Task 1-Ec dataset; we kept only 7 emotions: anger, disgust, fear, joy, love, sadness and surprise (multi-label classification)

Evaluation: internally annotated data in English, German, French

Results:
● English accuracy: 85%
● French accuracy: 80%
● German accuracy: 77%

Results → good enough to validate our approach
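Multi-label classification differs from the 3-class sentiment setup: each of the 7 emotions gets an independent sigmoid, and every emotion whose probability clears a threshold is predicted, so a post can carry several emotions at once. The logits and the 0.5 threshold below are illustrative.

```python
import numpy as np

# Sketch of a multi-label decision rule: independent sigmoids, one per
# emotion, thresholded separately (values here are made up for illustration).
EMOTIONS = ["anger", "disgust", "fear", "joy", "love", "sadness", "surprise"]

def predict_emotions(logits, threshold=0.5):
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits)))   # per-emotion sigmoid
    return [e for e, p in zip(EMOTIONS, probs) if p >= threshold]

print(predict_emotions([2.0, -1.0, -3.0, 1.5, 0.2, -2.0, -0.5]))
# ['anger', 'joy', 'love']
```

Unlike a softmax, nothing forces the per-emotion probabilities to sum to 1, which is what allows overlapping labels such as love & sadness in the same post.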

Validating our approach

1. Can we replace the DeepMoji architecture with a computationally cheaper one while preserving a good emotional context representation?

2. Can this emotional context representation using emojis be used to perform multilingual opinion mining tasks?*

*Only if the emotional context in which emojis are used is not too different from that of the language on which the Emoji2Task model was trained.

Conclusion & Perspectives

So far...

● Integrated our Unimoji model for sentiment analysis and emotion detection in 6 languages: French, English, Spanish, Portuguese, German and Italian
● For Simplified Chinese, the Doc2Emoji model was fine-tuned on a Chinese sentiment analysis dataset (improving accuracy by ~20%)
● Plan to add more languages to the model...

Key takeaways

● 10x faster Doc2Emoji architecture based on CNNs, with a small accuracy loss
● Unimoji = modular architecture: the Doc2Emoji/Emoji2Task components can be swapped for any model
● 2 opinion mining tasks trained on the same English emoji probability distribution as emotional representation:
○ Sentiment analysis (improving our inference accuracy)
○ Emotion detection (new feature !)

● Doc2Emoji can be fine-tuned for any language if a reliable manually annotated dataset is available
● Such models have limitations: different emotional contexts for emojis, different emoji distributions across languages, ...

What’s next?

● Add more languages
● Continue to explore limitations
● Don’t focus only on emojis
→ Explore cross-lingual models (LASER, XLM)
● New opinion mining tasks
→ Sarcasm detection, hate detection, optimism/pessimism, ...

Thank you !

Questions ?

LONDON: 1 Primrose Street, London EC2A 2EX (contact-uk@linkfluence.com)

DÜSSELDORF: Erkrather Straße 234b, 40233 Düsseldorf (kontakt@linkfluence.com)

SHANGHAI: Rm 512, 68 Changping Road (near West Suzhou Road), Shanghai (contact-asia@linkfluence.com)

SINGAPORE: Capital Tower #12-01, 168 Robinson Road, 068912 Singapore (contact-asia@linkfluence.com)

PARIS: 5, rue Choron, 75009 Paris (contact@linkfluence.com)

SAN FRANCISCO: 575 Market Street #11, San Francisco CA 94105 (contact@linkfluence.com)
