tutorial on deep learning

59
Tejas Kulkarni 1

Upload: inside-bigdatacom

Post on 21-Jan-2018

1.453 views

Category:

Technology


0 download

TRANSCRIPT

Page 1: Tutorial on Deep Learning

Tejas Kulkarni!1!

Page 2: Tutorial on Deep Learning

Introduction of Deep Learning!

Zaikun XuUSI, Master of Informatics

HPC Advisory Council Switzerland Conference 2016

2!

Page 3: Tutorial on Deep Learning

Overview!Introduction to deep learning!

Big data + Big computational power (GPUs)!

Latest updates of deep learning!

Examples!

Feature extraction!

3!

Page 4: Tutorial on Deep Learning

Machine Learning!

source : http://www.cs.utexas.edu/~eladlieb/RLRG.html!

Reinforcement learning

Introduction to deep learning!

4!

Page 5: Tutorial on Deep Learning

Inspiration from Human brain!

Introduction to deep learning!

•  Over 100 billion neurons !•  Around 1000 connections !•  Electrical signal transmitted through axon, can be inhibitory

or excitatory!•  Activation of a neuron depends on inputs!

5!

Page 6: Tutorial on Deep Learning

Artificial Neuron!

•  Circles —————————- Neuron!

source : http://cs231n.stanford.edu/!

•  Arrow —————————— Synapse!•  Weight —————————- Electrical signal!

Introduction to deep learning!

6!

Page 7: Tutorial on Deep Learning

Nonlinearity!Introduction to deep learning!

•  Neural network learns complicated structured data with non-linear activation functions!

7!

Page 8: Tutorial on Deep Learning

Perceptron!

•  Perceptron with linear activation!

Introduction to deep learning!

•  In1969, cannot learn XOR function and takes very long to train!

Winter of AI!

8!

Page 9: Tutorial on Deep Learning

Back-Propogation!

•  1970, got BS on psychology and keep studying NN at University of Edinburg !

Introduction to deep learning!

•  1986, Geoffrey Hinton and David Rumelhart, Nature paper "Learning Representations by Back-propagating errors”!

•  Hidden layers added!

•  Computation linearly depends on # of neurons !

•  Of course, computers at that time is much faster!

9!

Page 10: Tutorial on Deep Learning

Convolutionary NN!

•  Yann Lecun, born at Paris, post-doc of Hinton at U Toronto.!

Introduction to deep learning!

•  LeNet on hand-digital recognition at Bell Labs, 5% error rate!

•  Attack skill : convolution, pooling!

10!

Page 11: Tutorial on Deep Learning

Local connection!

11!

Introduction to deep learning!

Page 12: Tutorial on Deep Learning

Weight sharing!

12!

Introduction to deep learning!

Page 13: Tutorial on Deep Learning

Convolution!

•  Feature map too big!

13!

Introduction to deep learning!

Page 14: Tutorial on Deep Learning

Pooling!

14!

Introduction to deep learning!

http://vaaaaaanquish.hatenablog.com/entry/2015/01/26/060622!

Page 15: Tutorial on Deep Learning

Support Vector Machine!

Introduction to deep learning!

•  Vladmir Vapnik, born at Soviet Union, colleague of Yann Lecun , invented SVM at 1963!

•  Attack skill : Max-margin + kernel mapping!

15!

Page 16: Tutorial on Deep Learning

Another winter looming!Introduction to deep learning!

•  SVM : good at “capacity control”, easy to use, repeat!

•  NN : good at modelling complicated structure!

•  SVM achieved 0.8% error rate as 1998 and 0.56% in 2002 on the same hand-digit recognition task. !

•  NN stops at local optima, hard and takes long to train, overfitting!

16!

Page 17: Tutorial on Deep Learning

Recurrent Neural Nets!Introduction to deep learning!

http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-2-implementing-a-language-model-rnn-with-python-numpy-and-theano/!

17!

Model sequential data!

Page 18: Tutorial on Deep Learning

Vanish gradient!Introduction to deep learning!

18!

Page 19: Tutorial on Deep Learning

LSTM!

•  http://www.cs.toronto.edu/~graves/handwriting.html!19!

Introduction to deep learning!

Page 20: Tutorial on Deep Learning

10 years funding!

•  Hinton was happy and rebranded its neural network to deep learning. The story goes on …………..!

Introduction to deep learning!

20!

Page 21: Tutorial on Deep Learning

Other Pioneers!Introduction to deep learning!

21!

Page 22: Tutorial on Deep Learning

Data preparation!•  Input : Can be image, video, text, sounds….!

•  Output : Can be a translation of another language, location of an object, a caption describing a image ….!

Introduction to deep learning!

Training data! Validation data ! Test data !

Model!

22!

Page 23: Tutorial on Deep Learning

Training !•  Forward pass, propagating information from input to

ouput!

Introduction to deep learning!

23!

Page 24: Tutorial on Deep Learning

Training!•  Backward pass, propagating errors back and

update weights accordingly!

Introduction to deep learning!

24!

Page 25: Tutorial on Deep Learning

Parameter update!•  Gradient descent (sgd, mini-batch)!

Introduction to deep learning!

animation by Alec Radford!

•  x = x - lr * dx!

source : http://cs231n.stanford.edu/!

25!

Page 26: Tutorial on Deep Learning

Parameter update!•  Gradient descent!

Introduction to deep learning!

•  x = x - lr * dx!

animation by Alec Radford)!

•  Second order method ? Quasi-Newton’s method!26!

Page 27: Tutorial on Deep Learning

Problems!•  While neural networks can approximate any

function of input X to output y, it is very hard to train with multiple layers!

•  Initialization!

Introduction to deep learning!

•  Pre-training!•  More data/Dropout to avoid overfitting!•  Accelerate with GPU!•  Support training in parallel!

27!

Page 28: Tutorial on Deep Learning

ImageNet!

•  source :https://devblogs.nvidia.com/parallelforall/nvidia-ibm-cloud-support-imagenet-large-scale-visual-recognition-challenge/!

Big data + Big computational power!

•  >1M well labeled images of 1000 classes!

28!

Page 29: Tutorial on Deep Learning

Big data + Big computational power!

29!

Page 30: Tutorial on Deep Learning

GPUs!

•  source :http://techgage.com/article/gtc-2015-in-depth-recap-deep-learning-quadro-m6000-autonomous-driving-more/!

Big data + Big computational power!

30!

Page 31: Tutorial on Deep Learning

Parallelization!•  Data Parallelization: same weight but different data,

gradient needs to be synchronized!

Big data + Big computational power!

•  Not scale well with size of cluster!

•  Works well on smaller clusters and easy to implement !

source : Nvidia!31!

Page 32: Tutorial on Deep Learning

•  Model Parallelization: same data, but different parts of weight!

Parallelization!Big data + Big computational power!

•  Parameters in a big Neural net can not fit into memory of a single GPU!

Figure from Yi Wang! 32!

Page 33: Tutorial on Deep Learning

Model+Data Parallelization!Big data + Big computational power!

Krizhevsky et al. 2014!

computation!heavy!

weight heavy!

33!

Page 34: Tutorial on Deep Learning

!

•  Trained on 30M moves + Computation + Clear Reward scheme!

Big data + Big computational power!

34!

Page 35: Tutorial on Deep Learning

•  Task : A classifier to recognize hand written digits!

Feature extraction!

source :https://indico.io/blog/visualizing-with-t-sne/!

35!

Page 36: Tutorial on Deep Learning

Hierarchal feature extraction!

Feature extraction!

•  source : http://www.rsipvision.com/exploring-deep-learning/!36!

Page 37: Tutorial on Deep Learning

Transfer learning!•  Use features learned on ImageNet to tune your only classifier on

other tasks.!

Feature extraction!

•  9 categories + 200, 000 images!

•  extract features from last 2 layers + linear SVM!37!

Page 38: Tutorial on Deep Learning

Word embedding!•  Embed words into vector space such that similar words are more

close to each other in vector space!

Feature extraction!

source : http://anthonygarvan.github.io/wordgalaxy/!

•  Calculate cosine similarity !Pairs - France+ Italy = Rome!

38!

Page 39: Tutorial on Deep Learning

Examples!

39!

Page 40: Tutorial on Deep Learning

Go!

•  3361 different legal positions, roughly 10170!

Examples!

Brute Force does not work !!!!

•  Monto Carlo Tree Search !Heuristic Searching method!Still slow since the search tree is so big!

40!

Page 41: Tutorial on Deep Learning

AlphaGo!Examples!

•  Offline learning : from human go players, learn a Deep learning based policy network and a rollout policy

•  Use self-playing result to train a value network, which evaluate the current situation!

41!

Page 42: Tutorial on Deep Learning

Online Playing!Examples!

•  When playing with Lee Sedol, policy network give possible moves (about 10 to 20), the value network calculates the weight of position.!

•  MCTS update the weight of each possible moves and select one move.!

•  The engineering effort on systems with CPU and GPU with data , model parallelism matters

42!

Page 43: Tutorial on Deep Learning

Image recognition!Examples!

Train in a supervised fashion !43!

Page 44: Tutorial on Deep Learning

Caption generation!Examples!

44!

Page 45: Tutorial on Deep Learning

Neural MT!

•  The main idea behind RNNs is to compress a sequence of input symbols into a fixed-dimensional vector by using recursion.!

Examples!

source : https://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-with-gpus/!45!

Page 46: Tutorial on Deep Learning

Neural arts!Examples!

46!

Page 47: Tutorial on Deep Learning

Go deeper !Updates of deep learning!

•  Deep Residual Learning for Image Recognition, MSRA, 2015!47!

Page 48: Tutorial on Deep Learning

Go multi-modal!Updates of deep learning!

48!

Page 49: Tutorial on Deep Learning

Go End-End!Updates of deep learning!

source : https://devblogs.nvidia.com/parallelforall/deep-speech-accurate-speech-recognition-gpu-accelerated-deep-learning/#.VO56E8ZtEkM.google_plusone_share!

49!

Page 50: Tutorial on Deep Learning

Go with attention!Updates of deep learning!

50!

Page 51: Tutorial on Deep Learning

Go with Specalized Chips!Updates of deep learning!

•  Prototypes!

51!

Page 52: Tutorial on Deep Learning

Go where!•  Video understanding!

Updates of deep learning!

•  Deep reinforcement learning!

•  Natural language understanding!

•  Unsupervised learning !!!!

52!

Page 53: Tutorial on Deep Learning

Action classification!

53!

Page 54: Tutorial on Deep Learning

54!

Page 55: Tutorial on Deep Learning

55!

Page 56: Tutorial on Deep Learning

Summary!•  “The takeaway is that deep learning excels in tasks

where the basic unit, a single pixel, a single frequency, or a single word/character has little meaning in and of itself, but a combination of such units has a useful meaning.” — Dallin Akagi!

•  Structured data + well defined cost function/rewards !

•  learning in a End-End fashion is nice trend!

56!

Page 57: Tutorial on Deep Learning

Thoughts!•  Deep leaning will change our lives in a positive way!

•  We learn fast by either competing with friend like AlphaGo, or they directly teach us in a effective way

•  What might happen in the future!

•  Still long way to go towards real AI!

57!

Page 58: Tutorial on Deep Learning

list of materials!

•  https://github.com/ChristosChristofidis/awesome-deep-learning!

58!

Page 59: Tutorial on Deep Learning

Acknowledgement!

•  Tim Dettmers!

•  Lyu xiyu!

59!