Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction

Post on 21-Jan-2018

Víctor Campos, Xavier Giró, Amaia Salvador, Brendan Jou

Outline

1. Introduction
2. Related work
3. Methodology and results
4. Conclusions
5. Future work

Introduction: motivation

Introduction: problem definition

▷ What? Predict the sentiment that an image provokes in a human
▷ How? Using Convolutional Neural Networks (CNNs)

Introduction: example

Related work: low-level descriptors

Siersdorfer, S., Minack, E., Deng, F., & Hare, J. (2010, October). Analyzing and predicting sentiment of images on the social web. In Proceedings of the international conference on Multimedia (pp. 715-718). ACM.

Machajdik, J., & Hanbury, A. (2010, October). Affective image classification using features inspired by psychology and art theory. In Proceedings of the international conference on Multimedia (pp. 83-92). ACM.

Related work: SentiBank

Borth, D., Ji, R., Chen, T., Breuel, T., & Chang, S. F. (2013, October). Large-scale visual sentiment ontology and detectors using adjective noun pairs. In Proceedings of the 21st ACM international conference on Multimedia (pp. 223-232). ACM.

Related work: CNNs for sentiment prediction

You, Q., Luo, J., Jin, H., & Yang, J. (2015). Robust image sentiment analysis using progressively trained and domain transferred deep networks. In The Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI).

Outline

1. Introduction
2. Related work
3. Methodology and results
   a. Convolutional Neural Networks
   b. Datasets
   c. Experimental setup and results
4. Conclusions
5. Future work

Convolutional Neural Networks

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In NIPS.

Datasets

| | Flickr | Twitter |
|---|---|---|
| Authors | Borth et al. (2013) | You et al. (2015) |
| Size | ~500k | 1269 |
| Annotation method | Textual tags | 5 human annotators |
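The "5 human annotators" entry suggests how a high-agreement subset such as Twitter 5-agree could be derived: keep only the images on which all annotators give the same label. The following is a minimal sketch under that assumption, with binary positive/negative votes. The function name `agreement_subset` and the sample votes are hypothetical and are not taken from You et al. (2015).

```python
from collections import Counter


def agreement_subset(votes, min_agree=5):
    """Keep images whose majority label got at least `min_agree` votes.

    `votes` maps an image id to the list of labels given by its annotators.
    Returns a dict: image id -> agreed label.
    """
    subset = {}
    for image_id, labels in votes.items():
        label, count = Counter(labels).most_common(1)[0]
        if count >= min_agree:
            subset[image_id] = label
    return subset


# Hypothetical votes from 5 annotators per image (illustration only).
votes = {
    "img_a": ["pos", "pos", "pos", "pos", "pos"],
    "img_b": ["pos", "neg", "pos", "pos", "neg"],
    "img_c": ["neg", "neg", "neg", "neg", "neg"],
}
five_agree = agreement_subset(votes, min_agree=5)
```

Lowering `min_agree` trades annotation quality for size, which is the same tension the slides draw between the Flickr and Twitter datasets.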

[Figure: size (Flickr dataset) versus quality of the annotations (Twitter 5-agree dataset)]

Experimental setup: CNN

Model
ARCHITECTURE: CaffeNet
SOFTWARE: [Jia’14]
Pre-training DATASET: [Deng’09]
+ Fine-tuning DATASET: [You’15], Twitter 5-agree

Experimental setup: outline

1. Fine-tuning CaffeNet
2. Layer by layer analysis
3. Layer ablation
4. Layer addition

Fine-tuning CaffeNet

[Figure label: Pre-trained model]
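Fine-tuning starts from the pre-trained weights and only re-initialises what has to change for the new task. Caffe does this by layer name: layers whose names match the pre-trained model copy its weights, and renamed layers start from scratch. The sketch below mimics that rule with plain Python dictionaries and does not use Caffe. The function `init_for_finetuning`, the layer name `fc8_sentiment`, and the toy weight sizes are all hypothetical.

```python
import random


def init_for_finetuning(pretrained, target_shapes, seed=0):
    """Initialise a target network from a pre-trained one, layer by layer.

    Layers whose name also exists in `pretrained` copy its weights; any other
    layer (e.g. a new 2-neuron output layer) gets small random weights.
    Weights are flat lists of floats to keep the sketch dependency-free.
    """
    rng = random.Random(seed)
    weights = {}
    for name, size in target_shapes.items():
        if name in pretrained:
            assert len(pretrained[name]) == size, "shape mismatch for " + name
            weights[name] = list(pretrained[name])  # copy, do not alias
        else:
            weights[name] = [rng.gauss(0.0, 0.01) for _ in range(size)]
    return weights


# Toy stand-in for a pre-trained model (real layers are far larger).
pretrained = {"conv1": [0.1] * 8, "fc7": [0.2] * 6, "fc8": [0.3] * 10}
# Target network: same lower layers, original fc8 replaced by a 2-way layer.
target_shapes = {"conv1": 8, "fc7": 6, "fc8_sentiment": 2}
weights = init_for_finetuning(pretrained, target_shapes)
```

Training then continues on the target dataset, typically with a lower learning rate for the copied layers than for the new one.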

Layer by layer analysis

Layer ablation

Raw ablation

2-neuron on top

~16M params (~25%)
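To get a feel for what ablating a layer saves, the parameters can be counted layer by layer. The sketch below uses the standard CaffeNet/AlexNet layer shapes and compares the network with and without FC7, each with a 2-neuron layer on top. Reading the "~16M params" figure as the size of FC7 is an assumption; the slide does not say which layer it refers to, and the percentage depends on which total is used as the reference. The name `count_params` is hypothetical.

```python
# (name, weight count, bias count) for the standard CaffeNet/AlexNet layers.
CAFFENET_LAYERS = [
    ("conv1", 96 * 3 * 11 * 11, 96),
    ("conv2", 256 * 48 * 5 * 5, 256),    # 2 groups: 48 input channels each
    ("conv3", 384 * 256 * 3 * 3, 384),
    ("conv4", 384 * 192 * 3 * 3, 384),   # 2 groups
    ("conv5", 256 * 192 * 3 * 3, 256),   # 2 groups
    ("fc6", 9216 * 4096, 4096),
    ("fc7", 4096 * 4096, 4096),
]


def count_params(layers, top_inputs, n_outputs=2):
    """Parameters of `layers` plus an `n_outputs`-neuron layer on top."""
    body = sum(w + b for _, w, b in layers)
    top = top_inputs * n_outputs + n_outputs
    return body + top


full = count_params(CAFFENET_LAYERS, top_inputs=4096)          # fc7 kept
no_fc7 = count_params(CAFFENET_LAYERS[:-1], top_inputs=4096)   # fc7 removed
saved = full - no_fc7
print(f"full: {full:,}  without fc7: {no_fc7:,}  saved: {saved:,}")
```

Both variants feed 4096 activations into the 2-neuron layer, because FC6 and FC7 have the same width, so the difference is exactly the weights and biases of FC7.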

Layer addition

FC8: semantic information

Conclusions

[Figure labels: pre-trained model, CNN]

Future work

[Figure: size (Flickr dataset) versus quality of the annotations (Twitter dataset), with the MVSO dataset (†) added]

(†) B. Jou*, T. Chen*, N. Pappas*, M. Redi*, M. Topkara*, and S.-F. Chang. Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology. ACM Int'l Conference on Multimedia (MM), 2015.

Model
ARCHITECTURE: CaffeNet
SOFTWARE: [Jia’14]
DATASET: MVSO [Jou’15]

Acknowledgements

Financial support

Technical support

Albert Gil, Josep Pujal

Data augmentation (oversampling)
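The slides only name the technique, so the following sketch assumes the common Caffe-style oversampling: ten crops per image (four corners and the centre, each also mirrored) are passed through the CNN and their predictions averaged. The 256 and 227 pixel sizes are the usual CaffeNet conventions and are assumed here. The function names and the scores are hypothetical.

```python
def ten_crop_boxes(image_size=256, crop_size=227):
    """Return 10 (left, top, mirrored) crops: 4 corners + centre, each mirrored."""
    far = image_size - crop_size          # offset of the right/bottom crops
    mid = far // 2                        # offset of the centre crop
    offsets = [(0, 0), (far, 0), (0, far), (far, far), (mid, mid)]
    return [(x, y, mirrored) for mirrored in (False, True) for x, y in offsets]


def average_predictions(per_crop_scores):
    """Average class scores over all crops of one image."""
    n = len(per_crop_scores)
    return [sum(col) / n for col in zip(*per_crop_scores)]


boxes = ten_crop_boxes()
# Hypothetical [negative, positive] scores, one pair per crop.
scores = [[0.4, 0.6]] * 5 + [[0.2, 0.8]] * 5
image_score = average_predictions(scores)
```

Averaging over crops makes the final score less sensitive to where the salient content sits in the frame, at the cost of ten forward passes per image.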
