Deep Learning What Do Electric Sheep Dream About? Georgia Tech – CSE6242 – March 2015 Josh Patterson


TRANSCRIPT

Page 1: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Deep Learning

What Do Electric Sheep Dream About?

Georgia Tech – CSE6242 – March 2015
Josh Patterson

Page 2: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Presenter: Josh Patterson

• Email: [email protected]

• Twitter: @jpatanooga

• Github: https://github.com/jpatanooga

Past:

• Published in IAAI-09: “TinyTermite: A Secure Routing Algorithm”

• Grad work in meta-heuristics, ant algorithms

• Tennessee Valley Authority (TVA): Hadoop and the smart grid

• Cloudera: Principal Solution Architect

Today: Patterson Consulting

Page 3: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Topics

• What is Deep Learning?

• Types of Deep Networks

• Tools and Resources

Page 4: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

WHAT IS DEEP LEARNING?

“Cooper: [When Cooper tries to reconfigure TARS] Humour, 75%.
TARS: 75%. Self-destruct sequence in T minus 10, 9, 8...
Cooper: Let's make it 65%.
TARS: Knock, knock.”

--- Interstellar

Page 5: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

What is Deep Learning

• Deep Belief Networks: “exotic neural networks”
– Layers of Restricted Boltzmann Machines (RBMs)
– Plus a traditional feed-forward neural network

• RBMs learn progressively more complex features
– These features are transferred over to the “regular” neural network

• Shown to be very powerful in domain benchmarking (winning most)
– Audio
– Image
– Text

Page 6: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

We want to be able to recognize handwriting.

This is a hard problem.

Page 7: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

We can see what Deep Belief Networks are thinking as they learn

(Electric Sheep Do Dream)

Page 8: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015
Page 9: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015
Page 10: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

These are the features learned at each neuron in a Restricted Boltzmann Machine (RBM).

These features are passed to higher levels of RBMs to learn more complicated things.

Part of the “7” digit

Page 11: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

We can also ask an RBM directly what it thinks it has learned as it learns…

Page 12: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015
Page 13: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Lower Cross Entropy is Better
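The “lower is better” point is easy to see numerically. A minimal numpy sketch (the helper name `cross_entropy` is mine, not from the slides): a confident correct prediction scores lower than a hesitant one.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Average cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

targets   = np.array([[0.0, 1.0], [1.0, 0.0]])
confident = np.array([[0.1, 0.9], [0.9, 0.1]])
hesitant  = np.array([[0.4, 0.6], [0.6, 0.4]])

print(cross_entropy(targets, confident))  # ~0.105
print(cross_entropy(targets, hesitant))   # ~0.511
```

As the network's predicted probabilities move toward the true labels, this number falls, which is why the training curves on the slides trend downward.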

Page 14: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Deep Learning as Automated Feature Engineering

• Deep Learning can be thought of as a workflow for automated feature construction
– Where previously we’d consider each stage in the workflow as a unique technique

• Many of the techniques have been around for years
– But now they are being chained together in a way that automates exotic feature engineering

• As LeCun says:
– “machines that learn to represent the world”

Page 15: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

DEEP LEARNING ARCHITECTURES

Page 16: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Deep Learning Architectures

• Deep Belief Networks

• Convolutional Neural Networks

• Recurrent Networks

• Recursive Networks

Page 17: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Deep Belief Networks

• Layers of Restricted Boltzmann Machines (RBMs)
– Along with a canonical feed-forward / backpropagation neural network

• Layers of RBMs learn progressively higher-order features from the input data (pre-train phase)
– These weights are used to initialize the feed-forward network

• The feed-forward network then uses “gentle backpropagation”
– Fine-tune phase
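The pre-train phase can be sketched as a single RBM trained with contrastive divergence (CD-1); in a DBN, the learned weights `W` would then initialize one layer of the feed-forward network. This is a minimal numpy illustration under simplified assumptions (one binary example, one hidden layer), not DL4J's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Tiny RBM: 6 visible units, 3 hidden units.
n_vis, n_hid = 6, 3
W = rng.normal(0, 0.1, size=(n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)

def cd1_update(v0, lr=0.1):
    """One contrastive-divergence (CD-1) step on a single binary example."""
    global W, b_vis, b_hid
    h0_prob = sigmoid(v0 @ W + b_hid)                 # hidden activations given data
    h0 = (rng.random(n_hid) < h0_prob).astype(float)  # sample hidden states
    v1_prob = sigmoid(h0 @ W.T + b_vis)               # reconstruction of the input
    h1_prob = sigmoid(v1_prob @ W + b_hid)            # hidden given reconstruction
    W += lr * (np.outer(v0, h0_prob) - np.outer(v1_prob, h1_prob))
    b_vis += lr * (v0 - v1_prob)
    b_hid += lr * (h0_prob - h1_prob)
    return np.mean((v0 - v1_prob) ** 2)               # reconstruction error

v = np.array([1.0, 1.0, 0.0, 0.0, 1.0, 0.0])
errors = [cd1_update(v) for _ in range(200)]
# reconstruction error falls as the RBM learns the pattern
```

The falling reconstruction error is the same signal the cross-entropy plots on the earlier slides capture: the RBM is learning features that let it regenerate its input.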

Page 18: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Convolutional Networks

• Learns higher-order features through layers of convolutions

• Feature map layer (convolution)
– Consists of two layers: the previous layer, which forms a receptive field mapping onto the input, and the output, a retina layer
– The retina layer ties the receptive fields together to form the output
– These outputs are typically called filters

• Pooling layer
– Consolidates feature maps for the next layer

• Output layer
– Where we do classification
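The convolution and pooling steps above can be sketched in a few lines of numpy. The helper names (`conv2d_valid`, `max_pool`) and the toy edge kernel are mine for illustration, not from any particular library:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution (really cross-correlation, as in most DL libraries)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # each output cell sees only a small receptive field of the input
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: consolidates the feature map for the next layer."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)
edge_kernel = np.array([[1.0, -1.0], [1.0, -1.0]])  # crude vertical-edge detector
fmap = conv2d_valid(image, edge_kernel)   # 5x5 feature map (a "filter" output)
pooled = max_pool(fmap, 2)                # 2x2 pooled summary
```

Stacking several such convolution + pooling stages, each learning its own kernels, is what lets the network build up higher-order features.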

Page 19: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Recurrent Networks

• Like feed-forward networks
– But can have loops in the connections

• Allowing connection loops from the output of a layer back into the hidden layers
– Makes recurrent neural networks applicable to tasks like unsegmented connected handwriting recognition
– Time series / temporal effects
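The "loop" is just the hidden state feeding back into itself at each time step, which is what gives the network memory. A minimal sketch of one recurrent layer (names and sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Minimal recurrent layer: the hidden state feeds back into itself.
n_in, n_hid = 4, 5
W_xh = rng.normal(0, 0.1, (n_in, n_hid))   # input -> hidden
W_hh = rng.normal(0, 0.1, (n_hid, n_hid))  # hidden -> hidden (the loop)
b_h = np.zeros(n_hid)

def rnn_forward(xs):
    """Run a sequence through the recurrent layer; return all hidden states."""
    h = np.zeros(n_hid)
    states = []
    for x in xs:
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)  # new state depends on old state
        states.append(h)
    return states

sequence = [rng.normal(size=n_in) for _ in range(3)]
states = rnn_forward(sequence)  # one hidden state per time step
```

Because each state depends on the previous one, the same inputs in a different order produce a different final state, which is exactly what temporal tasks need.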

Page 20: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Recursive Networks

• Can deal with variable-length input
– Like recurrent networks

• Primary difference from recurrent networks:
– Can model hierarchical structures

• Has the ability to label objects in a scene
– Interesting applications in image decomposition
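The hierarchical idea can be sketched as one composition function applied at every node of a tree, so any tree over the inputs maps to a single fixed-size vector. A minimal illustration (the function names and the random composition matrix are assumptions, not a trained model):

```python
import numpy as np

rng = np.random.default_rng(2)

# A recursive network applies the SAME composition function at every
# node of a tree, collapsing any tree shape into one fixed-size vector.
d = 4
W = rng.normal(0, 0.5, (2 * d, d))  # composes two child vectors into one parent

def compose(left, right):
    return np.tanh(np.concatenate([left, right]) @ W)

def encode(tree):
    """tree is either a leaf vector or a (left, right) pair of subtrees."""
    if isinstance(tree, tuple):
        return compose(encode(tree[0]), encode(tree[1]))
    return tree

leaves = [rng.normal(size=d) for _ in range(3)]
# Two different hierarchies over the same leaves give different encodings:
v1 = encode(((leaves[0], leaves[1]), leaves[2]))
v2 = encode((leaves[0], (leaves[1], leaves[2])))
```

The sensitivity to tree shape is the point: in scene labeling or image decomposition, the hierarchy itself carries information that a flat sequence model would miss.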

Page 21: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

TOOLS AND RESOURCES

Page 22: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

DL4J

• “The Hadoop of Deep Learning”
– Command-line driven
– Java, Scala, and Python APIs

• ASF 2.0 licensed

• Java implementation
– Parallelization / GPU support

• Runtime neutral
– Local
– Hadoop / YARN
– Spark
– AWS

• https://github.com/deeplearning4j/deeplearning4j

Page 23: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

A Parting Thought

• Our terminology in data science has gotten more exotic
– But it’s still about the gathering, cleaning, visualizing, and feature construction of data

• We need to get data from a raw format into a baseline vector
– Which is why Canova exists: to feed raw data into a form DL4J can consume

• Deep Learning is not just classification
– It is an automated feature construction pipeline, capped by a classifier

• Together, DL4J and Canova give us the full workflow

Page 24: Deep Learning Intro - Georgia Tech - CSE6242 - March 2015

Questions?

• Thank you for your time and attention