Random Projection Neural Networks: Algorithms & Hardware
Jonathan Tapson, Arindam Basu
Telluride 2015

Page 1: Random Projection Neural Networks: Algorithms & Hardwareneuromorphs.net/nm/raw-attachment/wiki/2015/scc15/random... · Random Projection Neural Networks: Algorithms & ... • Random


All of Machine Learning in 2 Slides

• We generally use machine learning for two purposes. Given input data X and corresponding output data Y, find a function of X that performs either:
– Classification: classify each sample by finding a separating hyperplane, so that y ∈ {0, …, N−1} for N classes
– Regression: find a function y = f(x) that minimises some error condition

• If there is a linear solution, we don’t need machine learning

• For a nonlinear solution, we use a machine learning method that:
– Projects the data into a higher dimensional space
– Uses a nonlinear projection, so as to create separations which did not exist in the original space
– Lets us then solve using a linear solution in the higher dimensional space

Jonathan Tapson, The MARCS Institute

Machine Learning

Project into a higher dimensional space (add a z dimension) non-linearly, e.g. z = x²

[Figure: data plotted on axes x, y, and the same data lifted onto axes x, y, z]

Machine Learning

Find separating hyperplane:

[Figure: separating hyperplane drawn in the space with axes x, y, z]
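The lift-then-separate idea on these two slides can be tried numerically. The following is a minimal sketch, not taken from the talk: the grid data set, the `perceptron` helper and the 1000-epoch limit are assumptions chosen for illustration. Points are labelled by whether x² > 1, which no straight line in the (x, y) plane can separate. Adding the dimension z = x² lets a plain perceptron find a separating hyperplane.

```python
import numpy as np

def perceptron(features, labels, max_epochs=1000):
    """Classic perceptron rule; returns (weights, converged) for labels in {-1, +1}."""
    w = np.zeros(features.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for f, t in zip(features, labels):
            if t * (f @ w) <= 0:      # misclassified (or on the boundary)
                w += t * f            # nudge the hyperplane towards the sample
                mistakes += 1
        if mistakes == 0:             # a full pass with no errors: separated
            return w, True
    return w, False

# Illustrative grid of (x, y) points, leaving a gap around |x| = 1
xs = np.linspace(-2, 2, 21)
xs = xs[np.abs(np.abs(xs) - 1) > 0.1]
x, y = [a.ravel() for a in np.meshgrid(xs, [-1.0, 0.0, 1.0])]
labels = np.where(x**2 > 1, 1.0, -1.0)     # outer points vs inner points

ones = np.ones_like(x)
flat = np.column_stack([x, y, ones])           # original space plus bias
lifted = np.column_stack([x, y, x**2, ones])   # add z = x^2

w_flat, converged_flat = perceptron(flat, labels)
w_lifted, converged_lifted = perceptron(lifted, labels)
print("separable in (x, y):   ", converged_flat)
print("separable in (x, y, z):", converged_lifted)
```

In the lifted space the plane z = 1 separates the two classes, so the perceptron is guaranteed to converge. In the original space it never reaches an error-free pass.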


Neural Networks

• Single layer (perceptron)
– Linear classification only
– Easy to train

• Multiple layer (multi-layer perceptron)
– Universal approximator
– Difficult to train

• Support vector machines (SVM)
– Maximise margin of classification

• Combination of weak classifiers
– Voting, AdaBoost

• Random projection based neural networks
– Universal approximator
– Fast training
– Easy to implement (?)

Random Projection Neural Networks-I

• Large number of randomly weighted connections, which are not trained.

• Only output weights are trained (linear decoding), so training is very fast.

• Examples: Extreme Learning Machine (ELM), Neural Engineering Framework (NEF)

Input dimension: d

Hidden layer dimension: L (L >> d)

Random Projection Neural Networks-II

• Another example: reservoir computing (liquid state machine, echo state network)

• Recurrence encodes time history implicitly.

• Time history can also be encoded explicitly in an ELM.
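To make "recurrence encodes time history implicitly" concrete, here is a minimal echo-state-style sketch. It is an illustration under stated assumptions, not the speakers' implementation: the reservoir size (50), spectral radius (0.9), input scaling, and the choice of task (recalling the previous input sample) are all hypothetical. The recurrent and input weights are random and fixed. Only the linear readout is fitted.

```python
import numpy as np

def make_reservoir(n_units, n_inputs, spectral_radius=0.9, seed=0):
    """Random, untrained input and recurrent weights (assumed scaling)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_units, n_inputs))
    w = rng.normal(size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Drive the reservoir with an input sequence and collect its states."""
    state = np.zeros(w.shape[0])
    states = []
    for u_t in inputs:
        state = np.tanh(w @ state + w_in @ u_t)  # state mixes present and past
        states.append(state)
    return np.array(states)

rng = np.random.default_rng(1)
u = rng.uniform(-1, 1, size=(500, 1))     # illustrative random input signal
w_in, w_res = make_reservoir(50, 1)
states = run_reservoir(w_in, w_res, u)

# Linear readout trained to recall the PREVIOUS input u[t-1] from state[t]
H = np.column_stack([states[1:], np.ones(len(u) - 1)])
target = u[:-1, 0]
beta, *_ = np.linalg.lstsq(H, target, rcond=None)
mse = float(np.mean((H @ beta - target) ** 2))
print("readout MSE:", mse, " target variance:", float(np.var(target)))
```

The readout only sees the current state, so any ability to recover u[t-1] comes from the history held in the recurrent dynamics.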

ELM: Algorithms and Applications

Extreme Learning Machine (ELM)

• Multi-class regression & classification.

• Quick training, good generalization.

• Exploit fixed random weights of 1st layer for VLSI implementation.

G() can be even more general

Guang-Bin Huang et al., “Extreme learning machine for regression and multiclass classification,” IEEE Transactions SMC-B, 2012.
G. B. Huang, Q. Y. Zhu, and C. K. Siew, “Extreme Learning Machines: Theory and Applications,” Neurocomputing, 2006.

ELM: Training

Guang-Bin Huang et al., “Extreme learning machine for regression and multiclass classification,” IEEE Transactions SMC-B, 2012.
G. B. Huang, Q. Y. Zhu, and C. K. Siew, “Extreme Learning Machines: Theory and Applications,” Neurocomputing, 2006.

Where T is the training set.

Where H’ is the Moore-Penrose pseudoinverse (more sophisticated learning is possible).

However, other online training methods can be used: perceptron, LMS, etc.
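A minimal sketch of pseudoinverse training, assuming the standard ELM formulation in which H is the matrix of hidden-layer outputs and the output weights are obtained as β = H’T. The function names `elm_fit` and `elm_predict`, the tanh activation, the Gaussian weight distribution and the sine-fitting task are illustrative assumptions, not details from the talk.

```python
import numpy as np

def elm_fit(x, t, n_hidden=100, seed=0):
    """Fit an ELM: random fixed hidden layer, output weights via pseudoinverse."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(x.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random biases, never trained
    h = np.tanh(x @ w + b)                       # hidden-layer output matrix H
    beta = np.linalg.pinv(h) @ t                 # beta = pinv(H) @ T
    return w, b, beta

def elm_predict(x, w, b, beta):
    """Apply the fixed random projection, then the trained linear decoder."""
    return np.tanh(x @ w + b) @ beta

# Illustrative regression task: fit y = sin(x) on [-pi, pi]
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
t = np.sin(x)
w, b, beta = elm_fit(x, t)
train_mse = float(np.mean((elm_predict(x, w, b, beta) - t) ** 2))
print("training MSE:", train_mse)
```

Training is a single linear least-squares solve, which is where the speed comes from. An online rule such as LMS would instead update `beta` one sample at a time.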

Intuition about Universal Approximation

G.-B. Huang, L. Chen and C.-K. Siew, “Universal Approximation Using Incremental Constructive Feedforward Networks with Random Hidden Nodes”, IEEE Transactions on Neural Networks, vol. 17, no. 4, pp. 879-892, 2006.

What if you train all weights?

• ~3.5X more random projections...
• What if there is a ~100X benefit in other aspects?

[Figure: two panels, % error and training time, each plotted against # weak learners]

A. Rahimi and B. Recht, “Weighted Sums of Random Kitchen Sinks:…” NIPS, 2008.

Random Projections in Image Recognition

• Lots of architectures for deep networks: what is really important?
• 3 questions:
– Which nonlinearities following filters are good? (tanh, abs, max-pool, avg pool)
– Is unsupervised training much better than random filters?
– 2 stage vs 1 stage?

K. Jarrett et al., “What is the best multi-stage architecture… ,” ICCV 2009.

Random Projections in Image Recognition

Testing on Caltech 101

K. Jarrett et al., “What is the best multi-stage architecture… ,” ICCV 2009.

Random Projections in Image Recognition: Intuition

• Random filters are also tuned to specific frequencies!

• Great option for quick architectural explorations! (size of convolution filter, stride, pooling, etc.)

A. Saxe et al., “On Random Weights and Unsupervised..… ,” NIPS 2010.

ELM in Image Processing

• Can train multiple layers by ELM auto-encoder.

• 784-700-700-15000-10 network

L. Kasun et al., “Representational Learning with ELMs for Big Data,” IEEE Intelligent Systems, 2013.

ELM: Hardware Designs

Case study: Implantable Brain Machine Interfaces (BMI)

BrainGate, Brown University

Large Scale Implantable BMI: Problems & Solutions

PROBLEMS

• Data rate per channel ~ 200 Kbps
• 1000 channels → 200 Mbps
• Power dissipation

UNSUSTAINABLE

I. Stevenson and K. Kording, “How advances in neural recording ..,” Nature Neuroscience, 2011.

SOLUTION

• Compress data
• On-chip neural decoding
• E.g. decode which finger (5 choices) moves in which direction (2 choices) → 4 bits @ 1 kHz
• Spiking Neural Network (SNN) on-chip.
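The figures on this slide can be checked with a few lines of arithmetic. This is only a worked calculation from the numbers quoted above. The variable names are illustrative, and the resulting reduction factor is derived here rather than stated in the slides.

```python
import math

# Raw recording: ~200 kbps per channel, 1000 channels
rate_per_channel_bps = 200e3
n_channels = 1000
raw_rate_bps = rate_per_channel_bps * n_channels           # 200 Mbps

# Decoded output: which finger (5 choices) moves in which direction (2 choices)
n_fingers, n_directions = 5, 2
n_outcomes = n_fingers * n_directions                       # 10 outcomes
bits_per_decision = math.ceil(math.log2(n_outcomes))        # 4 bits cover 10 outcomes
decision_rate_hz = 1000                                     # 1 kHz
decoded_rate_bps = bits_per_decision * decision_rate_hz     # 4 kbps

reduction = raw_rate_bps / decoded_rate_bps
print(f"raw: {raw_rate_bps / 1e6:.0f} Mbps, decoded: {decoded_rate_bps / 1e3:.0f} kbps")
print(f"data-rate reduction: {reduction:.0f}x")
```

Sending decoded intentions instead of raw waveforms reduces the data rate to be transmitted by a factor of 50,000 under these numbers.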

Proposal: Machine Learning Co-processor (MLCP)

Example Application: Decoding Dexterous Finger Movement

• Classify 12 movement types and onset time of movement, reusing H!

• Moving average* of number of spikes is input feature x.

• Notation: input dimension D, number of hidden neurons L, number of classes C

MOTOR INTENTION DECODING

[Figure: monkey moves finger; recorded neural activity → neuromorphic decoder → predicted movement]

• 3 monkeys trained to perform visually cued flexion and extension of wrist & fingers
• Single unit activities recorded from M1 neurons.
• Pseudorandom sequence of movement types.
• Unsuccessful trials discarded from analysis.

V. Aggarwal et al., “Asynchronous decoding of …” IEEE TNSRE, 2008.

*Average over 100 ms with moving step of 20 ms
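The input feature described here, a spike count over a 100 ms window that advances in 20 ms steps, can be sketched as follows. This is an illustration only: the function name `windowed_spike_counts`, the use of integer millisecond timestamps and the example spike train are assumptions, and the decoder in the talk computes this on-chip rather than in software.

```python
import numpy as np

def windowed_spike_counts(spike_times_ms, duration_ms, window_ms=100, step_ms=20):
    """Count spikes in sliding windows [start, start + window_ms)."""
    times = np.sort(np.asarray(spike_times_ms))
    # Window start times: 0, step, 2*step, ... while the window fits in the recording
    starts = np.arange(0, duration_ms - window_ms + 1, step_ms)
    first = np.searchsorted(times, starts, side="left")              # first spike >= start
    past = np.searchsorted(times, starts + window_ms, side="left")   # first spike >= end
    return past - first

# Hypothetical spike train for one electrode over 200 ms
spikes_ms = [10, 15, 50, 130, 170]
counts = windowed_spike_counts(spikes_ms, duration_ms=200)
print(counts.tolist())   # one count per 20 ms step
```

Dividing each count by the window length gives a firing-rate estimate. One such value per electrode per step forms the feature vector x.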

Algorithmic novelty: Time Delay Based Dimension Increase (TDBDI)

• Common problem: loss of signal over time in some electrodes.

• Use extra information from previous (p-1) time samples of functional electrodes.

• For n electrodes, dimension D = n × p
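The dimension increase D = n × p amounts to stacking each feature vector with its p-1 predecessors. A minimal sketch, assuming features arrive as a (time steps × electrodes) array. The helper name `time_delay_embed` and the ordering of the stacked samples (newest first) are illustrative choices, not taken from the talk.

```python
import numpy as np

def time_delay_embed(features, p):
    """Stack each sample with its p-1 predecessors.

    features: array of shape (T, n), one row per time step, one column per electrode.
    Returns an array of shape (T - p + 1, n * p); each row holds the current
    sample followed by the previous p-1 samples.
    """
    features = np.asarray(features)
    n_steps = features.shape[0]
    if p < 1 or p > n_steps:
        raise ValueError("p must be between 1 and the number of time steps")
    # Block k holds the samples delayed by k steps
    blocks = [features[p - 1 - k : n_steps - k] for k in range(p)]
    return np.hstack(blocks)

# Illustrative data: n = 2 electrodes, T = 5 time steps
x = np.arange(10).reshape(5, 2)
embedded = time_delay_embed(x, p=3)
print(embedded.shape)   # n * p = 6 columns
print(embedded[0])      # sample at t=2, then t=1, then t=0
```

With fewer working electrodes, increasing p restores the input dimension seen by the random projection layer.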


Hardware Architecture: One Channel

[Block diagram: spikes → Counter → Moving Window Average → Digital Analog Converter → Random Projection; block labels: Win CNT, DAC, Current Mirror Array]

Proposed Design: Hardware Architecture of MLCP

• D=128, L=128
• 2nd stage on DSP

Chen Yi, Yao Enyi and Arindam Basu, "A 128 Channel 290 GMACs/W Machine Learning..," IEEE ISCAS, 2015.

Machine Learning Co-Processor (MLCP): Architecture

• D=128, L=128
• 2nd stage on DSP

A. Basu, S. Shuo, H. Zhou, G. Huang and M. Lim, “Silicon Spiking Neurons for Hardware Implementation of Extreme Learning Machines,” Neurocomputing, 2013.
Y. Enyi, S. Hussain, A. Basu and G. Huang, “Computation using Mismatch:…,” IEEE BioCAS, 2013.


Results: Characterization

IC fabricated in 0.35 µm CMOS

Portable External Unit (PEU)

Results: Performance in Neural Decoding

• Accuracy on par with the state of the art

• Can use extra information from earlier time samples if the number of M1 neurons is small!

• Power ~ 0.4 µW @ 50 classifications/sec → 8 nJ per classification!
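The energy figure follows directly from the quoted power and classification rate. It is shown here as a short worked calculation, with variable names chosen for illustration.

```python
# Energy per classification = power / classification rate
power_w = 0.4e-6              # ~0.4 microwatts
rate_per_s = 50               # 50 classifications per second
energy_j = power_w / rate_per_s
print(f"{energy_j * 1e9:.1f} nJ per classification")
```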


Results: Real Time Operation of MLCP

Chen Yi, Yao Enyi and Arindam Basu, "A 128 Channel 290 GMACs/W Machine Learning..," IEEE ISCAS, 2015.

In Telluride...

• Speech recognition with spikes from cochlea or MFCC.

• Word2vec

• Combined/fused classification of cochlea+DAVIS for lip-reading.
