Deep Learning Applications Fall 2018

Post on 27-Jun-2020

Page 1: Deep Learning Applicationsce.sharif.edu/courses/97-98/1/ce959-1/resources/root/slides/Lect-28.… · Man eating apes Man-eating apes Red apple on the book More Correlation Less Correlation

Deep Learning Applications

Fall 2018

Outline

► Problem Definition

► Background & Related Works

► Proposed Method

► Experimental Results

► Conclusion & Future Works

Problem Definition

Introduction

► What is Multimodal Data?

►Multiple channels of input

►Multiple views of the same concept

A Red Bird in a jungle.

پرنده‌ی قرمز, Red Bird, الطائر الأحمر, 红鸟, Красная Птица

Applications

►Helps inter-modal retrieval.

When you type “مطمئن” (Persian for "confident"), the Google search engine retrieves this image.

►Helps intra-modal retrieval.

When you search for images similar to the left image, the Google search engine retrieves the right image.

►Helps classification or clustering.

Example labels: Sport, Delicious

Challenges

►Distinct modality-specific properties.

►High correlation between modalities.

►Higher intra-modality than inter-modality correlation.

Figure labels: "Man eating apes", "Man-eating apes", "Red apple on the book"; scale from More Correlation to Less Correlation

Problem Formulation

►Inputs:

► Two modalities, denoted X and Z

►Goals:

► Extracting the most informative representation from X and Z

►Ability to generate missing modality from the present one

Background

Deep Neural Networks

►Traditional neural networks using:

►More training data

►Deeper Architecture

► Better Optimization algorithms

►Popular Deep Neural Networks:

► Stacked Denoising Auto-encoders

► Recurrent Neural Networks

►Generative Adversarial Networks

De-noising Auto-encoders

►Corrupt the input data with noise.

► Learn a representation of the corrupted data such that the reconstruction retains the most information about the clean input.

Diagram: X → X̃ (corrupted input) → Y (representation) → Z (reconstruction); quantity shown: I(X, Z)
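The corruption and reconstruction steps above can be sketched in a few lines. This is a minimal illustration, not the slides' implementation: the masking noise, the sigmoid units, the tied weights, the squared-error loss, and every function name are assumptions chosen for brevity. The key point it shows is that the loss compares the reconstruction Z with the clean input X, even though the network only sees the corrupted X̃.

```python
import math
import random

def corrupt(x, drop_prob, rng):
    """Masking noise (assumed): zero each component with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def encode(x, W, b):
    """Hidden representation y = sigmoid(W x + b)."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def decode(y, W, c):
    """Reconstruction z = sigmoid(W^T y + c), with tied weights (assumed)."""
    n_in = len(W[0])
    return [sigmoid(sum(W[j][i] * y[j] for j in range(len(y))) + c[i])
            for i in range(n_in)]

def denoising_loss(x, W, b, c, drop_prob, rng):
    """Squared error between the CLEAN x and the reconstruction of corrupted x."""
    x_tilde = corrupt(x, drop_prob, rng)
    z = decode(encode(x_tilde, W, b), W, c)
    return sum((xi - zi) ** 2 for xi, zi in zip(x, z))

rng = random.Random(0)
x = [1.0, 0.0, 1.0, 1.0]
W = [[rng.uniform(-0.5, 0.5) for _ in x] for _ in range(2)]  # 2 hidden units
b = [0.0, 0.0]
c = [0.0] * len(x)
loss = denoising_loss(x, W, b, c, drop_prob=0.3, rng=rng)
```

Training would minimize this loss over W, b and c with any gradient-based optimizer; that step is omitted here.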

Stacking Auto-encoders (SAE)

►Extract high-level representations by stacking auto-encoders into a deep architecture.

Diagram: X → Y → Z, with reconstructions X′ and Y′
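As a rough sketch of the stacking idea, the code below feeds the output of one encoder into the next, so X is mapped to Y and Y to Z. Only the forward pass is shown; the layer sizes, the fixed weights, and the helper names are hypothetical, and the layer-wise training of each auto-encoder is left out.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def encode(x, W, b):
    """One encoder layer: sigmoid(W x + b)."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def stacked_encode(x, layers):
    """Apply each encoder in turn and return every intermediate representation."""
    reps = []
    h = x
    for W, b in layers:
        h = encode(h, W, b)
        reps.append(h)
    return reps

# Hypothetical weights: X (3-d) -> Y (2-d) -> Z (1-d).
layers = [
    ([[0.5, -0.5, 0.25], [0.1, 0.2, 0.3]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]
y, z = stacked_encode([1.0, 0.0, 1.0], layers)
```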

Recurrent Neural Networks (RNNs)

► Feedforward networks with additional recurrent edges

► Powerful for sequential data like sentences
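The recurrent edge can be illustrated with a plain (Elman-style) recurrence, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h). This is a sketch under assumptions: the toy three-word vocabulary, the one-hot encoding, and the weight values are made up for illustration. Because the hidden state is carried from word to word, the same words in a different order give a different final state.

```python
import math

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One time step: h_t = tanh(W_xh x_t + W_hh h_prev + b_h)."""
    return [
        math.tanh(
            sum(w * v for w, v in zip(W_xh[i], x_t))
            + sum(w * v for w, v in zip(W_hh[i], h_prev))
            + b_h[i]
        )
        for i in range(len(b_h))
    ]

def rnn_encode(sequence, W_xh, W_hh, b_h):
    """Run the recurrence over a sequence and return the final hidden state."""
    h = [0.0] * len(b_h)
    for x_t in sequence:
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
    return h

# Toy one-hot vocabulary (assumed).
man, eating, apes = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
W_xh = [[0.5, -0.3, 0.8], [0.1, 0.7, -0.2]]
W_hh = [[0.4, -0.6], [0.3, 0.2]]
b_h = [0.0, 0.0]

h_a = rnn_encode([man, eating, apes], W_xh, W_hh, b_h)
h_b = rnn_encode([apes, eating, man], W_xh, W_hh, b_h)
```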

Generative Adversarial Networks (GANs) [3]

[3] Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.

Related Works

Multimodal Deep Learning [1]

► Use two modality-specific auto-encoders with a joint layer on top of them

► Train the network to reconstruct every modality from itself and from the other modality
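The structure described above can be sketched as follows: two modality-specific encoders, one joint layer on their concatenation, and two decoders. This is an illustrative forward pass only, not the architecture from [1]; the tanh units, the layer sizes, the squared-error loss, and all names (`init_params`, `bimodal_forward`, `reconstruction_loss`) are assumptions. Zeroing one input modality while still scoring both reconstructions is one way to ask the network to rebuild a modality from the other.

```python
import math
import random

def layer(x, W, b):
    """Fully connected layer with tanh activation."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def init_params(n_x, n_z, n_hidden, n_joint, rng):
    return {
        "Wx": rand_matrix(n_hidden, n_x, rng), "bx": [0.0] * n_hidden,
        "Wz": rand_matrix(n_hidden, n_z, rng), "bz": [0.0] * n_hidden,
        "Wj": rand_matrix(n_joint, 2 * n_hidden, rng), "bj": [0.0] * n_joint,
        "Ux": rand_matrix(n_x, n_joint, rng), "cx": [0.0] * n_x,
        "Uz": rand_matrix(n_z, n_joint, rng), "cz": [0.0] * n_z,
    }

def bimodal_forward(x, z, p):
    """Encode each modality, fuse in the joint layer, decode both modalities."""
    hx = layer(x, p["Wx"], p["bx"])           # modality-specific code for X
    hz = layer(z, p["Wz"], p["bz"])           # modality-specific code for Z
    joint = layer(hx + hz, p["Wj"], p["bj"])  # joint layer on the concatenation
    return layer(joint, p["Ux"], p["cx"]), layer(joint, p["Uz"], p["cz"])

def reconstruction_loss(x, z, p, drop=None):
    """Score BOTH clean modalities; `drop` ("x" or "z") zeroes one input."""
    x_in = [0.0] * len(x) if drop == "x" else x
    z_in = [0.0] * len(z) if drop == "z" else z
    x_hat, z_hat = bimodal_forward(x_in, z_in, p)
    return (sum((a - b) ** 2 for a, b in zip(x, x_hat))
            + sum((a - b) ** 2 for a, b in zip(z, z_hat)))

rng = random.Random(0)
p = init_params(n_x=3, n_z=2, n_hidden=2, n_joint=2, rng=rng)
x, z = [0.5, -0.5, 0.25], [1.0, -1.0]
losses = {drop: reconstruction_loss(x, z, p, drop) for drop in (None, "x", "z")}
```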

[1] Ngiam, Jiquan, et al. "Multimodal deep learning." Proceedings of the 28th international conference on machine learning (ICML-11). 2011.

MDL-CW: A multimodal deep learning framework with cross weights [2]

[2] Rastegar, Sarah, et al. "Mdl-cw: A multimodal deep learning framework with cross weights." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.

Generative Adversarial Text to Image Synthesis [4]

[4] Reed, Scott, et al. "Generative adversarial text to image synthesis." arXiv preprint arXiv:1605.05396 (2016).

Prior Works

Approach | Pros | Cons

SAE (Ngiam 2011, Sohn 2014) | Simple implementation | Discards low-level interactions

MDL-CW (Rastegar 2016) | Considers lower-level interactions | Non-generative

RNN (Socher 2013, Karpathy 2014, Karpathy 2015) | Considers sentence structure | Convergence problems

GAN (Reed 2016) | Generative | Memorization

Proposed Method

Shadow Networks

►Train a network to detect when a certain class is absent

Relativeness

►Two relative data points are similar in one particular sense

► Binary Relativeness

► Fuzzy Relativeness

►Relativeness is a function of representation level

Representation Binding by degree K

►For each of the K final layers, the representations of two relative data points are the same

►Relativeness is a function of level, so two data points that are relative at one level can be irrelative at other levels
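One way to read "binding by degree K" is as a penalty that pushes the last K layer representations of two relative inputs together. The sketch below assumes a squared-distance penalty; the slides do not state the exact loss, and `binding_penalty` is a hypothetical name. Each input is given as a list of per-layer activation vectors, ordered from the first layer to the final one.

```python
def binding_penalty(reps_a, reps_b, k):
    """Sum of squared distances over the final k layers (assumed penalty)."""
    if k <= 0:
        return 0.0
    total = 0.0
    for h_a, h_b in zip(reps_a[-k:], reps_b[-k:]):
        total += sum((u - v) ** 2 for u, v in zip(h_a, h_b))
    return total

# Two inputs that differ in the first layer but agree in the last two layers.
reps_a = [[1.0, 0.0], [0.5, 0.5], [0.2]]
reps_b = [[0.0, 1.0], [0.5, 0.5], [0.2]]

bound_k2 = binding_penalty(reps_a, reps_b, k=2)  # last two layers already agree
bound_k3 = binding_penalty(reps_a, reps_b, k=3)  # also includes the first layer
```

With this reading, the pair is already bound by degree 2 (zero penalty) but not by degree 3, which matches the idea that relativeness depends on the level.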

Binding Representations for both networks

►Main network:

► For each level, choose the nearest neighbors among the relatives from the higher layer

► Bind the representations in this layer for these relatives

Layer | Relatives

Final | Horses

Before final | Dark Horses

Two before final | Dark Arabic Horses

Binding Representations for both networks

►Shadow network:

► For each level, choose the farthest neighbors among the relatives from the higher layer

► Bind the representations in this layer for these relatives

Layer | Relatives

Final | Non-Horses

Before final | Dog, Plane, Table, …
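The selection rule for the two networks differs only in direction: nearest neighbors for the main network, farthest for the shadow network. A minimal sketch, assuming squared Euclidean distance measured in the current layer's representation (the slides do not specify the metric) and using hypothetical names throughout:

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def select_relatives(anchor, relatives, reps, n, farthest=False):
    """Pick n items from `relatives` (the relatives inherited from the higher
    layer) by distance to `anchor` in this layer's representations `reps`.
    farthest=False: nearest neighbors (main network).
    farthest=True:  farthest neighbors (shadow network)."""
    ranked = sorted(
        relatives,
        key=lambda i: squared_distance(reps[anchor], reps[i]),
        reverse=farthest,
    )
    return ranked[:n]

# Toy layer representations keyed by sample id.
reps = {0: [0.0, 0.0], 1: [0.1, 0.0], 2: [1.0, 1.0], 3: [0.2, 0.1]}
main_relatives = select_relatives(0, [1, 2, 3], reps, n=2)
shadow_relatives = select_relatives(0, [1, 2, 3], reps, n=1, farthest=True)
```

The selected samples would then have their representations bound in this layer, for example with a distance penalty.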

Cross Edges

►Learn cross edge weights between shadow and main networks

Representation Gating

►Three representations are available from the lower-layer representations

►Use modality presence signals to deduce the final representation

Diagram labels: Gate; Same modality; Cross modality; Cross modality shadow; Modality presence signals; Higher-level representation; Higher same modality; Higher cross modality
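A gate of this kind can be sketched as a weighted blend of the three candidate representations in which candidates that depend on an absent modality receive zero weight. The softmax form, the scalar scores, and the function name `gate` are assumptions made for illustration; the slides only state that presence signals are used to deduce the final representation.

```python
import math

def gate(candidates, scores, present):
    """Blend candidate vectors with softmax weights over the present ones.
    candidates: [same modality, cross modality, cross modality shadow]
    scores:     one scalar per candidate (assumed to be learned elsewhere)
    present:    one flag per candidate, derived from modality presence signals"""
    exps = [math.exp(s) if p else 0.0 for s, p in zip(scores, present)]
    total = sum(exps)
    if total == 0.0:
        raise ValueError("at least one candidate must be present")
    weights = [e / total for e in exps]
    dim = len(candidates[0])
    blended = [sum(w * c[i] for w, c in zip(weights, candidates))
               for i in range(dim)]
    return blended, weights

candidates = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
scores = [2.0, 1.0, 0.0]

# Own modality missing: only the two cross-modality candidates contribute.
blended, weights = gate(candidates, scores, present=[False, True, True])

# Only the own modality available: its representation passes through unchanged.
only_same, _ = gate(candidates, scores, present=[True, False, False])
```

Giving absent candidates exactly zero weight is what keeps a weak or missing representation from corrupting the informative one.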

Experimental Results

►We used the PASCAL-Sentence dataset for this section:

► Each image is annotated with 5 sentences

► 500 train and 500 test images

► 1408 textual features

► 260 visual features

PASCAL-Sentence Dataset Experiments

Figure panels: Text to whole Image and Text; Image to whole Image and Text. Cited in the figure: [1], [6], [7], [9], [12], [10]

Qualitative Results

Figure panels: Image to whole Image & Text; Text to whole Image & Text

Conclusions

►Using shadow networks allows us to detect the absence of topics

►Using representation binding leads to better generalization

►Gating representations preserves the informative representation and does not corrupt it with weaker representations

Future Works

Creation and Deception

Diagram labels: Main; Main Generator; Shadow; Shadow Generator; Creation; Deception

Creation

►The main generator generates synthetic (not real) data that carries the desired label

Deception

►The shadow generator generates data that deceives the main network into predicting a wrong label

Future Works

►Neuron augmentation

►Using RNNs to distinguish between creation and deception

►Implementing brain cognitive functions

►Implementing social interactions between networks

Thank You!

References

1. J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A. Y. Ng, “Multimodal deep learning,” in Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011, pp. 689–696.

2. S. Rastegar, M. Soleymani, H. R. Rabiee, and S. M. Shojaee, “MDL-CW: A multimodal deep learning framework with cross weights,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2601–2609.

3. N. Srivastava and R. R. Salakhutdinov, “Multimodal learning with deep Boltzmann machines,” in Advances in Neural Information Processing Systems, 2012, pp. 2222–2230.

4. R. Socher, A. Karpathy, Q. V. Le, C. D. Manning, and A. Y. Ng, “Grounded compositional semantics for finding and describing images with sentences,” Transactions of the Association for Computational Linguistics, vol. 2, pp. 207–218, 2014.

5. K. Sohn, W. Shang, and H. Lee, “Improved multimodal deep learning with variation of information,” in Advances in Neural Information Processing Systems, 2014, pp. 2141–2149.

6. A. Karpathy, A. Joulin, and F. F. Li, “Deep fragment embeddings for bidirectional image sentence mapping,” in Advances in Neural Information Processing Systems, 2014, pp. 1889–1897.

7. R. Socher, C. C. Lin, C. Manning, and A. Y. Ng, “Parsing natural scenes and natural language with recursive neural networks,” in Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011, pp. 129–136.

8. M. Rastegari, J. Choi, S. Fakhraei, H. Daumé III, and L. Davis, “Predictable dual-view hashing,” in Proceedings of the 30th International Conference on Machine Learning, 2013, pp. 1328–1336.

9. B. Ozdemir and L. S. Davis, “A probabilistic framework for multimodal retrieval using integrative Indian buffet process,” in Advances in Neural Information Processing Systems, 2014, pp. 2384–2392.

10. P. L. Lai and C. Fyfe, “Kernel and nonlinear canonical correlation analysis,” International Journal of Neural Systems, vol. 10, no. 05, pp. 365–377, 2000.

11. Y. Weiss, A. Torralba, and R. Fergus, “Spectral hashing,” in Advances in Neural Information Processing Systems, 2009, pp. 1753–1760.

11. Y. Gong and S. Lazebnik, “Iterative quantization: A procrustean approach to learning binary codes,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2011, pp. 817–824.

12. A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, and T. Mikolov, “DeViSE: A deep visual-semantic embedding model,” in Advances in Neural Information Processing Systems, 2013, pp. 2121–2129.

13. A. Gionis, P. Indyk, R. Motwani et al., “Similarity search in high dimensions via hashing,” in VLDB, vol. 99, 1999, pp. 518–529.

14. G. Madjarov, D. Kocev, D. Gjorgjevikj, and S. Džeroski, “An extensive experimental comparison of methods for multilabel learning,” Pattern Recognition, vol. 45, no. 9, pp. 3084–3104, 2012.

Multimodal Deep Boltzmann Machine [1]

MDL-CL: A Multimodal Deep Learning Framework with Cross Layers