Deep Learning on Big Data

Aurelio Uncini, Facoltà di Ingegneria dell'Informazione, Informatica e Statistica (I3S), http://ispac.diet.uniroma1.it. Ordine degli Ingegneri, provincia di Ancona. Bologna, 7 November 2013; Rome, July 2015; Ancona, June 12, 2015.


TRANSCRIPT

Page 1

Deep Learning on Big Data

Aurelio Uncini

Ordine degli Ingegneri, provincia di Ancona

Facoltà di Ingegneria dell'Informazione, Informatica e Statistica (I3S)

http://ispac.diet.uniroma1.it

Bologna, 7 November 2013; Rome, July 2015; Ancona, June 12, 2015

Page 2

Prologue

Page 3

• Aristotle argued that all people have similar intellectual faculties and that the differences are due to teaching and example.

• My elementary school teacher said that "man is intelligent because he has the ability to adapt".

• Bernard Widrow (LMS inventor): "I'm an 'adaptive' guy"

Keywords: teaching, example, and adaptation

Page 4

The Big Data Phenomenon

Page 5

Exponential growth of available information

• Social networks

• Sensor networks

• Internet of Things

• Bureaucratic and specific databases

• Apps

• …

Page 6

Big Data cycle

[Figure: cycle linking Apps, Users, and Data]

Page 7

Big Data: the many 'V's

By 2020: about 44×10^21 bytes (44 zettabytes)

• Volume

• Velocity

• Variability

• Variety

Source: IDC's Digital Universe Study (EMC)

Page 8

Big Data

• Untapped opportunities for socioeconomic growth (World Economic Forum)

• "Data is the new oil of the Internet and the new currency of the digital world." (Meglena Kuneva, European Consumer Commissioner)

• Data in the 21st century is like oil in the 18th century: an immensely valuable, untapped asset.

Page 9

Big Data: many 'V' definition

Big problem: extraction of 'V'alue from the large pools of data

Cost center Profit center

Harvesting valuable knowledge from Big Data is not an ordinary task.

Today, machine learning methods have come to play a vital role in Big Data analytics and knowledge discovery.

Page 10

Big-Data relevant themes

[Diagram: themes arranged around the deep learning method (deep neural nets)]

• Computational intelligence methods: deep neural nets, convolutional neural nets, distributed neural nets, metaheuristics, …

• Data constraints: massive scale, decentralized, real-time stream

• Infrastructure: massive-scale cloud storage, high-speed networks, high-speed computers

• Value: BD business model, BD analytics, high-value-added products, …

• Task: modeling, prediction, classification, clustering, …

• Computational model: adaptive, parallel, distributed, local connections, 'green'

Page 11

• Support projects that can transform our ability to harness, in novel ways, huge volumes of digital data.

• In April 2013, U.S. President Barack Obama announced another federal project, a new brain-mapping initiative called BRAIN (Brain Research Through Advancing Innovative Neurotechnologies).

• President Barack Obama's Big Data Keynote, Hadoop World 2015 (he talks about the importance of Big Data and Data Science) (19 Feb 2015)

Page 12

Biologically inspired computing

Page 13

Biologically inspired approach ...

[Diagram labels: Brain; instinct, knowledge, experience, culture, emotions, aware, memory, a priori knowledge, rules, reasoning ability, ...; deduction; action]

Moreover: fusion with other information ...

Page 14

... most of our behaviors, which combine information, knowledge, and intelligence, happen unconsciously.

Ex. Complex scene summarization in a few words

Page 15

Characteristics of the biological brain

The neuron cell

[Figure: neuron cell. Stimuli arrive at the dendrites (receivers); the cell body contains the nucleus; the response leaves through the axon terminals (transmitters)]

Page 16

• Birth of Artificial Neural Networks (ANN) (1940s)

• The formal neuron of McCulloch–Pitts (1943)

• Simple biologically inspired circuit

[Figure: the neuron's inputs $x_1, \ldots, x_M$ (stimuli, dendrites) are multiplied by the synaptic weights $w_1, \ldots, w_M$ and added, together with a threshold or bias, at the summing junction; the resulting cell potential (activation) $s$ passes through a non-linear activation function $\varphi(s)$ to give the response $y$ on the axon]

$s = \mathbf{w}^T \mathbf{x}, \qquad y = \varphi(s) = \varphi(\mathbf{w}^T \mathbf{x})$

• Can be implemented by a very simple algorithm. Suitable for Artificial Neural Networks
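The formal neuron above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the hard-threshold activation and the AND weights are assumptions chosen for the example, and the names `step` and `neuron` are hypothetical.

```python
def step(s):
    """Hard-threshold activation: fire (1) when the cell potential is non-negative."""
    return 1 if s >= 0 else 0


def neuron(x, w, bias=0.0, activation=step):
    """Formal neuron: weighted sum of the inputs followed by a non-linearity."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + bias  # cell potential s = w^T x + bias
    return activation(s)                             # response y = phi(s)


# Illustrative weights (an assumption, not from the slides) that make the neuron act as AND.
and_weights, and_bias = [1.0, 1.0], -1.5
outputs = [neuron([a, b], and_weights, and_bias) for a in (0, 1) for b in (0, 1)]
print(outputs)
```

Only the input pair (1, 1) pushes the potential above the threshold, so the neuron fires for that pair alone.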

Page 17

Learning model and paradigms

• Learning model: simple rewarding mechanism

• In general terms we can define two learning paradigms:
  – Supervised
  – Unsupervised

Page 18

Supervised learning

Learning through teaching by examples

$\mathbf{w}_{n+1} = \mathbf{w}_n + \text{Rewarding\_Function}$

[Figure: the response to the stimuli is compared with the correct answer given by a supervisor or teacher; the resulting error $e[n]$ drives the reward mechanism (external forcing)]

Rewarding mechanism: error function minimization provided through examples.

Page 19

Learning by error correction

• A learning algorithm with concrete and useful results is the LMS algorithm (Delta rule) of Bernard Widrow (1959).

[Figure: the external stimuli (signals) $\mathbf{x}$ produce the response $y$; comparison with the desired output $d$ (supervisor or teacher) gives the error $e$, which the learning algorithm uses to update the weights $\mathbf{w}$]

$y_n = \mathbf{w}_n^T \mathbf{x}, \qquad e = d - y, \qquad \mathbf{w}_{n+1} = \mathbf{w}_n + \mu\, \mathbf{x}\, e$

Bernard Widrow: "I'm an 'adaptive' guy". Professor Emeritus, Electrical Engineering Department, Stanford University, USA
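A minimal sketch of the LMS (Delta rule) update on this slide may help make the error-correction loop concrete. The step size `mu`, the number of epochs, and the synthetic noise-free data are assumptions made for the illustration; `lms` is a hypothetical helper name.

```python
import random


def lms(samples, n_weights, mu=0.05, epochs=50):
    """Widrow-Hoff LMS: w <- w + mu * x * e, with e = d - w^T x."""
    w = [0.0] * n_weights
    for _ in range(epochs):
        for x, d in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))        # response
            e = d - y                                        # error against the teacher
            w = [wi + mu * xi * e for wi, xi in zip(w, x)]   # error-correction update
    return w


# Hypothetical noise-free data generated by a known weight vector.
random.seed(0)
true_w = [2.0, -1.0]
samples = []
for _ in range(200):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    samples.append((x, sum(t * xi for t, xi in zip(true_w, x))))

w = lms(samples, n_weights=2)
print([round(wi, 3) for wi in w])
```

Because the desired outputs here are generated without noise, the learned weights should approach `true_w`; with noisy data the weights would instead fluctuate around the best linear fit.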

Page 20

Multi-Layer Neural Networks

[Figure: feed-forward computation from the input vector (pattern) $\mathbf{x}$ through many hidden layers with weights $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}, \mathbf{W}^{(3)}$ to the outputs $\mathbf{y}$; the outputs are compared with the correct answer to get the error signal, which is back-propagated to get the derivatives for learning]

$\mathbf{y} = \Phi^{(3)}\big(\mathbf{W}^{(3)}\, \Phi^{(2)}\big(\mathbf{W}^{(2)}\, \Phi^{(1)}\big(\mathbf{W}^{(1)} \mathbf{x}\big)\big)\big)$

Back-Propagation learning algorithm (mid-1980s)
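The feed-forward half of this slide, the nested composition of weight matrices and non-linearities, can be sketched as follows. The `tanh` activation, the layer sizes, and the weight values are assumptions for the example; back-propagation itself is not shown.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]


def forward(x, weights, activation=math.tanh):
    """Feed-forward pass: apply each weight matrix, then the non-linearity, layer by layer."""
    a = x
    for W in weights:
        a = [activation(s) for s in matvec(W, a)]
    return a


# Hypothetical weights for a 2-3-2-1 network.
W1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
W2 = [[0.2, -0.5, 0.1], [0.7, 0.3, -0.6]]
W3 = [[1.0, -1.0]]
y = forward([1.0, 2.0], [W1, W2, W3])
print(y)
```

Each pass through the loop corresponds to one pair of a weight matrix and an activation function in the formula, so adding hidden layers only means appending matrices to the list.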

Page 21

Unsupervised learning

Learning through self-adaptation

[Figure: stimuli, response, and rewarding mechanism; no external forcing]

Rewarding mechanism: a simple primal instinct that creates the adaptation, i.e. natural evolutionary behavior

Page 22

Unsupervised learning

Hebbian learning

• Hebb's Postulate: the strength of the connection depends on the activity between the neurons.

Donald Hebb (Canadian psychologist, 1904-1985), McGill University, Montreal
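One common mathematical reading of Hebb's postulate is that a weight grows in proportion to the product of the input and the output activity. The sketch below assumes that plain form for a single linear neuron; the learning rate `eta` and the input pattern are illustrative, and `hebbian_step` is a hypothetical name.

```python
def hebbian_step(w, x, eta=0.1):
    """One Hebbian update: each weight grows with the product of its input and the output."""
    y = sum(wi * xi for wi, xi in zip(w, x))            # post-synaptic activity
    return [wi + eta * y * xi for wi, xi in zip(w, x)]  # no teacher, no error signal


w = [0.1, 0.1, 0.1]
pattern = [1.0, 1.0, 0.0]   # inputs 0 and 1 are active together, input 2 is silent
for _ in range(5):
    w = hebbian_step(w, pattern)
print(w)
```

The connections of the two co-active inputs are strengthened while the silent one is left unchanged. The plain rule grows without bound, which is why practical variants add some form of normalization.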

Page 23

Neural Networks History: Gartner Hype Cycle

• Neural Network Disillusionment

[Figure: Gartner hype cycle for neural networks. Vertical axis: expectations or media hype; horizontal axis: time (1950-70, '80, '90, '00, '06, '10). Phases: Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity. Curve labels: Widrow's LMS, BP, MLP, RNN, NNs Rebirth]

Page 24

BP-NNs Disillusionment, '80s and '90s

• Supervised learning
  – It requires labeled training data
  – Almost all data is unlabeled

• Long learning time
  – Very slow in networks with many hidden layers
  – Vanishing gradient problem

• It may fall into poor local minima
  – For deep networks they may be too far from the optimal solution
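The vanishing gradient problem named above can be illustrated with a small calculation. The sketch assumes sigmoid activations and a single chain of units with unit weights, which is a simplification made for the example rather than the setting of any particular network.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def sigmoid_derivative(s):
    y = sigmoid(s)
    return y * (1.0 - y)          # never larger than 0.25


def backpropagated_factor(depth, s=0.0, weight=1.0):
    """Product of the per-layer factors weight * sigmoid'(s) across `depth` layers."""
    factor = 1.0
    for _ in range(depth):
        factor *= weight * sigmoid_derivative(s)
    return factor


for depth in (1, 5, 10, 20):
    print(depth, backpropagated_factor(depth))
```

Since each layer multiplies the error signal by at most 0.25 in this setting, the signal reaching the first layers shrinks geometrically with depth, which is one reason learning was very slow in networks with many hidden layers.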

Page 25

Back-propagation problems in the '80s and '90s

Three main problems of BP

1. Difficulty of producing labelled training data sets: not enough labelled data.

2. CPUs were not fast enough.

3. Difficulty of finding the correct weights: error propagation problems.

Page 26

What has happened recently

1. Labelled data sets got much bigger.

2. Computers got much faster.

3. New paradigm for learning deep layers using unlabeled data (2006).

• Result: deep neural networks are now the state of the art for many real-world problems.

Page 27

Deep Neural Networks

Page 28

Neural Networks History: Gartner Hype Cycle

[Figure: Gartner hype cycle for neural networks, extended to DNNs. Vertical axis: expectations or media hype; horizontal axis: time (1950-70, '80, '90, '00, '06, '10). Phases: Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity?. Curve labels: Widrow's LMS, BP, MLP, RNN, NNs 2nd Rebirth, DNN, DNN (industry)]

Page 29

Deep Neural Networks - Gartner Hype Cycle

[Figure: hypothesized trend of the hype cycle. Vertical axis: expectations or media hype; horizontal axis: time (1950-70, '80, '90, '00, '06, '10). Curve labels: Widrow's LMS, BP, DNN, Strong AI. Annotation: WARNING (Bill Gates, Stephen Hawking)]

http://www.huffingtonpost.com/james-barrat/hawking-gates-artificial-intelligence_b_7008706.html

Page 30

Machine Learning performance vs amount of data

[Figure: performance (vertical axis) versus amount of data (horizontal axis), with one curve for deep learning methods and one for standard machine learning algorithms]

Page 31

Deep Learning definition

• Many definitions:

• DL is a set of algorithms in machine learning that attempt to learn in multiple levels, corresponding to different levels of abstraction. It typically uses artificial neural networks.

• DL is a class of machine learning techniques that exploit many layers of non-linear information processing for supervised or unsupervised feature extraction and transformation, and for pattern analysis and classification.

• …

Page 32

DL Biological evidence

• For example, the layered organization of the visual system

[Figure: external stimuli reach the receptors and pass through many levels of transformation (hidden layers: memory, ideation, psyche, etc.) before reaching the motoneurons and muscle cells]

Page 33

DL Psychological-cognitive evidence

• Knowledge is represented at different levels of abstraction

[Figure: pyramid rising from concreteness to abstraction: Data, Information, Knowledge, Understanding, Insight, Wisdom]

The Ladder-of-Abstraction and the Data-Wisdom Pyramid

Page 34

Examples of Deep Learning solutions

• Apple – Siri speech recognition, iPhone personal assistant, …

• Facebook – massive data analysis, …

• Google – Translator, Android's voice recognition, text processing (Word2Vec), (Google acquires AI startup DeepMind, > $500M), …

• IBM – brain-like computer, deep learning for Big Data, (IBM acquires AlchemyAPI, enhancing Watson's Deep Learning capabilities), …

• Microsoft – speech, massive data analysis, …

• Twitter – acquires Deep Learning startup Madbits

• Yahoo – acquires startup LookFlow to work on Flickr and Deep Learning

• As data keeps getting bigger, DL is coming to play a key role in:
  – Data modeling
  – Analytics solutions
  – Leverage for competitive advantage

Page 35

Three main DNN families (L. Deng, D. Yu 2014)

• Deep networks for unsupervised or generative learning
  – Capture high-order correlation of the observed data when no information about target class labels is available.

• Deep networks for supervised learning
  – Directly provide discriminative power for pattern classification purposes.

• Hybrid deep networks
  – Mix of the previous models. The goal is discrimination, which is assisted, often in a significant way, by the outcomes of generative or unsupervised deep networks.

• Research activity in the field is very high

Page 36

Unsupervised generative model

• Ex. Deep Belief Networks (DBN)

• Stack of Restricted Boltzmann Machines (RBM)

Independent unsupervised training of each layer.

DBN can effectively utilize large amounts of unlabeled data for exploiting complex data structures.

[Figure: DBN as a stack of RBMs: input layer $V$, hidden layers $H_1$, $H_2$, $H_3$, and output layer $O$, connected by the weights $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}, \mathbf{W}^{(3)}, \mathbf{W}^{(4)}$; each pair of adjacent layers forms an RBM]

Page 37

Deep networks for supervised learning

• Ex. Convolutional Neural Network (CNN)

Yann LeCun (NYU)

Specific architecture for image classification

Fig. from: http://parse.ele.tue.nl/cluster/2/CNNArchitecture.jpg

Biologically inspired: small collections of neurons look at small portions of the input image, acting as receptive fields.

Page 38

Convolutional Neural Network (CNN)

[Figure: network from the input through Layer 1 … Layer 7: convolutional layers (same weights used at all spatial locations in a layer), fully-connected layers, and a softmax to predict the object class]

Biologically inspired: small collections of neurons look at small portions of the input image, acting as receptive fields.

Won the 2012 ImageNet challenge with a 16.4% top-5 error rate
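The idea that a convolutional layer uses the same weights at all spatial locations can be sketched with a single small kernel slid over an image. The image, the kernel values, and the helper name `conv2d_valid` are assumptions for the illustration; as in most CNN software, the operation shown is strictly a cross-correlation with no padding.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over the image; the same weights are reused at every location."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Each output unit only sees a small kh x kw patch: its receptive field.
            sum(kernel[i][j] * image[r + i][c + j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 4x4 image with a vertical edge, and a 2x2 edge-detecting kernel.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1], [-1, 1]]
feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The feature map responds only where the patch straddles the edge, and the layer needs just four weights regardless of the image size.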

Page 39

Hybrid DNN architecture

[Figure: stack of layers with weights $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}, \ldots, \mathbf{W}^{(N-1)}, \mathbf{W}^{(N)}$ ending in a softmax classifier; the figure labels an unsupervised learning part and a supervised learning part]

Supervised final fine-tuning

Page 40

DNN by stacked autoencoder

[Figure: autoencoders with layer sizes $N$, $P_1$, $P_2$, $P_3$, $P_4$ and weights $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}, \mathbf{W}^{(3)}, \mathbf{W}^{(4)}$, stacked and followed by a softmax classifier that outputs the classes]

Separate unsupervised pre-training of the hidden layers

Page 41

Large Scale Deep Neural Network

Page 42

Parallel and distributed computing

[Figure: computing resources joined by a high-speed network: SM-MIMD, DM-MIMD, vector supercomputer, special-purpose architecture (SIMD), workstation, storage]

Page 43

Large Scale Distributed Deep Networks

• Problem: training a deep network with billions of parameters using tens of thousands of CPU cores.

• Exploit many kinds of parallelism

• Data parallelism

• Model parallelism

• Data and model parallelism

Page 44

Large scale DNN

• Model parallelism

• Network partitions

[Figure: the network is partitioned across Machine 1, Machine 2, Machine 3, and Machine 4, with the data as input]

Minimal network traffic: the most densely connected areas are on the same partition

Page 45

Large scale SGD

• Asynchronous Stochastic Gradient Descent (SGD) (Widrow's Generalized Delta Rule (GDR))

'Downpour' SGD (1). Model replicas asynchronously fetch parameters $\mathbf{w}$ and push gradients $\Delta\mathbf{w}$ to the parameter server.

$\mathbf{w}_{n+1} = \mathbf{w}_n - \eta\,\Delta\mathbf{w}_n$

[Figure: parameter server exchanging $\mathbf{w}_n$ and $\Delta\mathbf{w}_n$ with the model replicas, each of which reads its own data shard]

(1) Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Andrew Y. Ng, 'Large Scale Distributed Deep Networks', NIPS 2012.
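The fetch-and-push pattern of Downpour SGD can be imitated in a single process. The sketch below is a toy simulation under stated assumptions, not the DistBelief implementation: it uses a linear model with noise-free data, takes the replicas in round-robin order to stand in for asynchrony, and the class `ParameterServer` and the step size `eta` are hypothetical.

```python
import random


def gradient(w, shard):
    """Mean-squared-error gradient of a linear model on one data shard."""
    g = [0.0] * len(w)
    for x, d in shard:
        e = sum(wi * xi for wi, xi in zip(w, x)) - d
        for i, xi in enumerate(x):
            g[i] += 2.0 * e * xi / len(shard)
    return g


class ParameterServer:
    """Hypothetical in-process stand-in for the parameter server."""

    def __init__(self, n, eta):
        self.w = [0.0] * n
        self.eta = eta

    def fetch(self):
        return list(self.w)

    def push(self, grad):
        self.w = [wi - self.eta * gi for wi, gi in zip(self.w, grad)]


random.seed(1)
true_w = [1.5, -0.5]
data = []
for _ in range(400):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    data.append((x, sum(t * xi for t, xi in zip(true_w, x))))
shards = [data[i::4] for i in range(4)]           # one data shard per model replica

server = ParameterServer(n=2, eta=0.1)
local_w = [server.fetch() for _ in shards]         # each replica holds its own copy
for step in range(2000):
    r = step % len(shards)                         # replicas take turns in this toy version
    server.push(gradient(local_w[r], shards[r]))   # gradient computed from a stale copy
    local_w[r] = server.fetch()                    # then refresh the local parameters
print([round(wi, 3) for wi in server.w])
```

Each replica computes its gradient from parameters that are already a few updates old by the time the gradient is pushed. With a small enough step size the server still converges here, which is the behaviour the asynchronous scheme relies on.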

Page 46

Large scale L-BFGS

• Limited-memory quasi-Newton algorithm of Broyden, Fletcher, Goldfarb, Shanno (L-BFGS).

$\mathbf{w}_{n+1} = \mathbf{w}_n - \eta\,\Delta\mathbf{w}_n$

L-BFGS: a single 'coordinator' sends small messages to replicas and the parameter server to orchestrate batch optimization.

[Figure: coordinator (small messages), parameter server exchanging $\mathbf{w}_n$ and $\Delta\mathbf{w}_n$ with the model replicas, and the data]

Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Andrew Y. Ng, 'Large Scale Distributed Deep Networks', NIPS 2012.

Page 47

DNN on Big-Data applications

Page 48

DNN: state-of-the-art performance reported in several domains

• Text, Language Model and Natural Language Processing

• Information Retrieval

• Visual Object Recognition and Computer Vision

• Speech Recognition and Audio Processing

• Multimodal and Multi-task Learning: Text-Image, Speech-Image, …

Page 49

Text and Language processing

• Feedforward Neural Net Language Model

Neural net language model architecture

The training is done using backpropagation

The word vectors are in matrix $\mathbf{U}$

[Figure: the input words (the preceding words such as $w(t-2)$ and $w(t-1)$) are each mapped through the shared matrix $\mathbf{U}$ to the projection layer, followed by a hidden layer and an output layer that gives $w(t)$; weight matrices $\mathbf{U}$, $\mathbf{V}$, $\mathbf{W}$]

Page 50

Text and Language processing

• Skip-gram Architecture

Predicts the surrounding words given the current word

[Figure: input $w(t)$, hidden layers, outputs $w(t-2)$, $w(t-1)$, $w(t+1)$, $w(t+2)$]
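To make "predicts the surrounding words given the current word" concrete, the sketch below builds the (input, output) training pairs a skip-gram model would see for one sentence. The window of two words on each side matches the figure; the sentence and the helper name `skipgram_pairs` are assumptions, and the neural network itself is not shown.

```python
def skipgram_pairs(tokens, window=2):
    """Return (current word, surrounding word) training pairs."""
    pairs = []
    for t, center in enumerate(tokens):
        for offset in range(-window, window + 1):
            # Skip the current word itself and positions outside the sentence.
            if offset != 0 and 0 <= t + offset < len(tokens):
                pairs.append((center, tokens[t + offset]))
    return pairs


sentence = "deep learning on big data".split()
pairs = skipgram_pairs(sentence)
print(pairs[:4])
```

Every word acts once as the input and is paired with each neighbour inside the window, so a five-word sentence already yields fourteen training pairs.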

Page 51

Text and Language processing

• Ex. Skip-gram Text Model

[Figure: raw sparse features, single embedding function, hierarchical softmax classifier]

Mikolov, Chen, Corrado and Dean. Efficient Estimation of Word Representations in Vector Space, http://arxiv.org/abs/1301.3781.

Page 52

Text and Language processing

• Continuous Bag-of-words (CBOW) Architecture

Predicts the current word given the context

[Figure: inputs $w(t-2)$, $w(t-1)$, $w(t+1)$, $w(t+2)$, hidden layers, output $w(t)$]
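CBOW reverses the direction of skip-gram: the context words are the input and the current word is the target. The sketch below builds such examples and averages the context vectors, which is one common way of combining them. The toy two-dimensional vectors and the helper names are assumptions for the illustration.

```python
def cbow_examples(tokens, window=2):
    """Return (context words, current word) training examples."""
    examples = []
    for t, target in enumerate(tokens):
        lo, hi = max(0, t - window), min(len(tokens), t + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != t]
        examples.append((context, target))
    return examples


def average(vectors):
    """Combine the context word vectors into a single input by averaging them."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


sentence = "deep learning on big data".split()
examples = cbow_examples(sentence)
print(examples[2])

# Hypothetical 2-dimensional word vectors, only to show the averaging step.
toy_vectors = {"deep": [1.0, 0.0], "learning": [0.0, 1.0], "big": [1.0, 1.0], "data": [0.0, 0.0]}
context, target = examples[2]
hidden_input = average([toy_vectors[w] for w in context])
print(hidden_input)
```

Because the context vectors are averaged, the order of the surrounding words does not matter, which is where the "bag-of-words" in the name comes from.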

Page 53

Example: Google

• Neural network trained to predict a word given its nearby words.

• It allows you to create a numerical representation of each word.

• These representations can be manipulated mathematically as ordinary vectors.

• Training carried out on a database of hundreds of billions of words.

http://deeplearning4j.org/word2vec.html

Page 54

Example: Google

• W2V is a neural net that processes text before that text is handled by deep-learning algorithms.

• W2V creates features without human intervention, including the context of individual words.

• W2V can make highly accurate guesses about a word's meaning based on its past appearances.

• Word: 'france'

Word          Cosine distance
-----------------------------
spain         0.678515
belgium       0.665923
netherlands   0.652428
italy         0.633130
switzerland   0.622323
luxembourg    0.610033
portugal      0.577154
russia        0.571507
germany       0.563291
catalonia     0.534176
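A list like the one above is produced by ranking words by the cosine of the angle between their vectors. The figures in the table rank the closest words first, so larger values mean more similar words. The sketch below shows the calculation on hypothetical three-dimensional vectors; real Word2vec vectors have far more dimensions, and the helper names are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 for the same direction, 0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, vectors, top_n=3):
    """Rank all other words by their cosine similarity to `word`."""
    scores = [(other, cosine_similarity(vectors[word], vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical toy vectors, chosen by hand for the illustration.
toy = {
    "france": [0.9, 0.1, 0.0],
    "spain": [0.8, 0.2, 0.1],
    "italy": [0.7, 0.3, 0.0],
    "banana": [0.0, 0.1, 0.9],
}
for other, score in nearest("france", toy):
    print(f"{other:10s} {score:.6f}")
```

The cosine ignores vector length and compares directions only, which suits word vectors whose magnitudes vary with word frequency.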

Page 55

Example: Google

• Here's a graph of words associated with "China" using Word2vec

Page 56

Example: Google

• 'Semantic computation'

• The word vectors capture many linguistic regularities; for example, the vector operation

vector('Paris') - vector('France') + vector('Italy')

results in a vector that is very close to vector('Rome'), and

vector('king') - vector('man') + vector('woman')

is close to vector('queen').

• W2V is the key element for the development of applications of great 'V'alue, operating on Big Data
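The vector arithmetic on this slide can be reproduced on toy data. The two-dimensional vectors below are invented so that one axis loosely encodes royalty and the other gender; they are an assumption for the illustration, not real Word2vec output, and `analogy` is a hypothetical helper.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(a, b, c, vectors):
    """Return the word closest to vector(a) - vector(b) + vector(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hypothetical vectors: axis 0 ~ "royalty", axis 1 ~ "gender".
toy = {
    "king": [0.9, 0.9],
    "queen": [0.9, 0.1],
    "man": [0.1, 0.9],
    "woman": [0.1, 0.1],
    "apple": [-0.5, 0.5],
}
print(analogy("king", "man", "woman", toy))
```

Subtracting `man` and adding `woman` moves the point along the gender axis while leaving the royalty axis untouched, so the nearest remaining word is `queen`.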

Page 57

Visual object recognition: GoogleNet (1)

Deep Network
1 billion connections, 9-layered locally connected sparse autoencoder trained over a dataset of 10 million 200x200 pixel images downloaded from the Internet.

Training
Parallel Asynchronous Stochastic Gradient Descent on a cluster with 1,000 machines (16,000 cores) for three days.

Image from (1)

(1) Q. Le, M.A. Ranzato, R. Monga, M. Devin, K. Chen, G. Corrado, J. Dean, A. Ng, "Building high-level features using large scale unsupervised learning", International Conference on Machine Learning, 2012.

Page 58

Visual object recognition (2014 Google)

Winner of the 2014 ImageNet challenge with a 6.66% top-5 error rate

• 6 modules of convolutional layers.

• 24 layers deep!

• Good fine-grained classification

• Good generalization

• Sensible errors

Image from: http://www.engadget.com/2014/09/08/google-details-object-recognition-tech/

"Snake" "Dog"
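The top-5 error rate quoted on this slide counts an image as correct when the true class is among the five classes the network scores highest. The sketch below computes it on hypothetical scores for three images and six classes; the numbers and the helper name `top_k_error` are invented for the illustration.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Hypothetical class scores for 3 images over 6 classes.
scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.10],   # true class 2 is ranked first
    [0.30, 0.25, 0.20, 0.15, 0.09, 0.01],   # true class 5 is ranked last
    [0.20, 0.04, 0.06, 0.08, 0.12, 0.50],   # true class 0 is ranked second
]
labels = [2, 5, 0]
print(top_k_error(scores, labels, k=5))
print(top_k_error(scores, labels, k=1))
```

The third image is a miss under the top-1 criterion but a hit under top-5, which is why top-5 error rates are always at most as large as top-1 rates on the same predictions.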

Page 59

Visual object recognition: complex scene summarization in a few words (1)

"Two pizzas sitting on top of a stove top oven"

(1) Google Research Blog, http://googleresearch.blogspot.it/2014/11/a-picture-is-worth-thousand-coherent.html

Page 60

Visual object recognition: complex scene summarization in a few words

Page 61

App: NVIDIA's DRIVE PX (1)

Self-Driving Cars Using Deep Learning

(1) http://dataconomy.com/nvidias-drive-px-platform-to-pave-way-for-self-driving-cars-using-deep-learning/

Page 62

Industrial sectors of interest

• Topics include:

Banking / Retail / Finance
– Identify: prospective customers, dissatisfied customers, good customers, bad payers
– Obtain: more effective advertising, less credit risk, less fraud, decreased churn rate
– Finance: econometrics, time series analysis and prediction

Biomedical / Biometrics
– Medicine: screening, diagnosis and prognosis, drug discovery (semantic medicine)
– Security: face recognition, signature / fingerprint / iris verification, speaker recognition, DNA fingerprinting, …

Computer / Internet / Multimedia
– Computer interfaces: troubleshooting wizards, handwriting and speech, brain waves
– Internet: hit ranking, text categorization, text translation, sentiment analysis, …
– Cyber security: network anomaly, cyber-attack prediction, spam detection, malicious code recognition, …
– Audio-video processing, audio-video content information retrieval, scene analysis, video games, virtual movies, ...

Electrical / Computer Engineering
– Wireless communication, cognitive radio, remote sensing, array processing, multi-sensor data fusion, robotics, smart grid, intelligent house, ...

Data processing
– Classification, time series filtering, prediction, regression, clustering, spam filtering, security, …

• Etc.


Research activity @

Computational intelligence

• Fast DNN models and architectures

• Random feature extraction

• Semi-supervised model

• Evolutionary methods for learning

• Distributed learning with Big Data


Research activity @

Ex. Large-Scale Distributed Learning on Big Data

• Development of learning algorithms that need no communication with a single central node and that can scale to large networks.

• The data are distributed on a network of interconnected agents.

• Applications include learning on sensor networks, on peer-to-peer networks, on swarms of robots, …

• Lynx toolbox: an open source Matlab toolbox, designed for fast prototyping of supervised machine learning simulations.
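
To make the idea of learning without a central node concrete, here is a minimal sketch in plain Python. It is not the method used in the research described above and it is not part of the Lynx toolbox: the consensus-averaging scheme with Metropolis weights, the ring topology, the toy data, and every name (`metropolis_consensus`, `local_data`, `estimates`) are assumptions chosen for illustration. Each agent keeps its samples private, exchanges only two summary numbers with its neighbors, and ends up with the same least-squares slope a central server would have computed.

```python
def metropolis_consensus(values, neighbors, steps=100):
    """Average `values` over a network using only neighbor-to-neighbor exchanges."""
    x = list(values)
    for _ in range(steps):
        nxt = []
        for i, xi in enumerate(x):
            upd = xi
            for j in neighbors[i]:
                # Metropolis weight: symmetric in (i, j), so the global average is preserved.
                w = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
                upd += w * (x[j] - xi)
            nxt.append(upd)
        x = nxt
    return x


# Hypothetical toy data: each agent privately holds a few (x, y) samples of y ≈ 2x.
local_data = [
    [(1.0, 2.1), (2.0, 3.9)],
    [(3.0, 6.2)],
    [(4.0, 7.8), (5.0, 10.1)],
    [(6.0, 12.0)],
]

# Assumed topology: a ring, so every agent talks to exactly two neighbors.
n = len(local_data)
neighbors = [[(i - 1) % n, (i + 1) % n] for i in range(n)]

# Local sufficient statistics for a 1-D least-squares fit y = w * x.
sxy = [sum(x * y for x, y in d) for d in local_data]
sxx = [sum(x * x for x, y in d) for d in local_data]

# Agents agree on the network-wide averages of both statistics.
avg_sxy = metropolis_consensus(sxy, neighbors)
avg_sxx = metropolis_consensus(sxx, neighbors)

# Every agent can now form the global least-squares slope on its own.
estimates = [a / b for a, b in zip(avg_sxy, avg_sxx)]
```

The ratio of the two averaged statistics equals the ratio of their sums, so no agent ever needs to know how many agents are in the network. Scaling to larger networks changes only the `neighbors` list.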


Highlights of Deep Learning on Big Data

• DL can be used to merge symbolic and non-symbolic heterogeneous information

• Development of parallel DL algorithms distributed over clusters of servers and/or parallel processors (e.g. CUDA GPUs, …)

• Mixed supervised and unsupervised learning

• Possibility of continuous adaptation (learn while it is working)

• Possibility of customized solutions for specific problems

• Real-time data stream processing

• Training on large-scale datasets takes on the order of weeks, even on the fastest available GPUs

• Heuristic approach for the determination of the network topology

• Many tricks are needed to make the networks learn well

• Developing applications with DL requires expertise and experience
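
The "learn while it is working" highlight can be illustrated with a very small online learner. The sketch below uses the classic LMS rule on a single linear unit instead of a deep network, purely to keep the example short. The step size `mu`, the synthetic noise-free stream, and the names `lms_stream`, `make_stream` and `true_w` are all assumptions made for illustration.

```python
import random


def lms_stream(stream, n_weights, mu=0.1):
    """Update a linear model one sample at a time, as the data arrive."""
    w = [0.0] * n_weights
    for x, d in stream:
        y = sum(wi * xi for wi, xi in zip(w, x))        # current prediction
        e = d - y                                       # instantaneous error
        w = [wi + mu * e * xi for wi, xi in zip(w, x)]  # LMS correction
    return w


# Hypothetical target weights that the stream is generated from.
true_w = [0.5, -1.0, 2.0]
rng = random.Random(0)


def make_stream(n_samples):
    """Yield (input, desired output) pairs one by one, like a live data feed."""
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        d = sum(t * xi for t, xi in zip(true_w, x))
        yield x, d


w_hat = lms_stream(make_stream(5000), len(true_w))
```

The model never stores the stream: each sample is used once and discarded, which is what makes this style of update suitable for real-time data. With noisy data the weights would hover around the target instead of settling on it, and a smaller `mu` would trade adaptation speed for steadier estimates.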


Epilogue


• Had Aristotle and my elementary school teacher already understood everything?


Conclusions

• The problem of artificial consciousness seems to be the final act in the history of engineering. Taking the word engineer in its broad sense of one who makes, building an artifact able to say "I exist" could be the ultimate dream of the human being as builder, who wants to build even without knowing.

Ing. Vincenzo Tagliasco (1941-2008)

• My great failing has always been, and always will be, one: to desire and dream instead of willing and doing.

Ing. Carlo Emilio Gadda (1893-1973)


• Questions?