Real-Time Machine Learning with Node.js - Philipp Burckhardt, Carnegie Mellon University

11/14/2016 Machine Learning http://localhost:3000/#/?export&_k=lv9fld 1/23

Upload: nodejsfoundation

Post on 11-Jan-2017


Page 2:

REAL-TIME MACHINE LEARNING WITH NODE.JS

PHILIPP BURCKHARDT
Carnegie Mellon University

Page 3:

LEARNING PATTERNS FROM DATA

(Image: iStock)

Page 4:

REAL-TIME MACHINE LEARNING WITH NODE.JS

Page 5:

TRAINING ALGORITHMS

BATCH: Build model using a batch of available data

INCREMENTAL: Update model as new data comes in

Page 6:

Update estimator as new data comes in...

    var incrmean = require( '@stdlib/math/generics/statistics/incrmean' );

    // Assumption: the slide's first lines, where `randu` is defined, are missing
    // from the transcript; Math.random stands in as a uniform [0,1) generator.
    var randu = Math.random;

    var accumulator = incrmean();

    // For each simulated datum, update the mean...
    for ( var i = 0; i < 100; i++ ) {
        var v = randu() * 100.0;
        accumulator( v );
    }
    // Calling the accumulator without an argument returns the current mean:
    var mean = accumulator();
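To make the idea of an incremental estimator concrete, here is a minimal sketch of a running-mean accumulator in plain JavaScript with the same calling convention as on the slide. The name `createMeanAccumulator` is hypothetical and this is not stdlib's implementation; it only illustrates the update rule mean_n = mean_(n-1) + (x_n - mean_(n-1)) / n, which needs no stored history.

```javascript
'use strict';

// Hypothetical helper (not part of stdlib): returns an accumulator function.
function createMeanAccumulator() {
    var mean = 0.0;
    var n = 0;
    return function accumulator( x ) {
        if ( arguments.length === 0 ) {
            // No datum provided: report the current estimate (null if empty).
            return ( n === 0 ) ? null : mean;
        }
        n += 1;
        // Move the mean toward the new datum by 1/n of the difference.
        mean += ( x - mean ) / n;
        return mean;
    };
}

var acc = createMeanAccumulator();
[ 2.0, 4.0, 6.0, 8.0 ].forEach( function onDatum( x ) {
    acc( x );
});
var runningMean = acc();
console.log( runningMean );
```

Only the current mean and the count are kept in memory, which is what makes this kind of estimator suitable for unbounded data streams.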

Page 7:

Prediction is very difficult, especially if it's about the future.

- Niels Bohr

Page 8:

DATA ASSUMED TO BE INDEPENDENTLY AND IDENTICALLY DISTRIBUTED (I.I.D.)

Might not hold: e.g., time series are mostly non-stationary

Page 9:

Update moving mean as data comes in...

    'use strict';

    var incrmmean = require( '@stdlib/math/generics/statistics/incrmmean' );
    var os = require( 'os' );

    // Moving mean over a window of size 5:
    var accumulator = incrmmean( 5 );

    // Every second, feed in the fraction of system memory that is free:
    setInterval( function() {
        var mem = os.freemem() / os.totalmem();
        accumulator( mem );
        var mean = accumulator();
    }, 1000 );

Page 10:

Moving Means

window size
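The role of the window size can be illustrated with a small sketch of a moving mean that keeps only the last W values in a circular buffer. The helper name `createMovingMean` is hypothetical and this is not how stdlib implements its moving mean; it is a plain-JavaScript illustration under that assumption.

```javascript
'use strict';

// Hypothetical helper (not stdlib's implementation): mean of the last W values.
function createMovingMean( W ) {
    var buf = [];  // holds at most W values
    var sum = 0.0; // sum of the values currently in the buffer
    var i = 0;     // index of the oldest value once the buffer is full
    return function accumulator( x ) {
        if ( arguments.length === 0 ) {
            return ( buf.length === 0 ) ? null : sum / buf.length;
        }
        if ( buf.length < W ) {
            // Window not yet full: just append.
            buf.push( x );
        } else {
            // Window full: drop the oldest value and overwrite its slot.
            sum -= buf[ i ];
            buf[ i ] = x;
            i = ( i + 1 ) % W;
        }
        sum += x;
        return sum / buf.length;
    };
}

var mmean = createMovingMean( 3 );
var out = [ 1, 2, 3, 4, 5 ].map( function onDatum( x ) {
    return mmean( x );
});
console.log( out );
```

A small window reacts quickly to changes in the data but is noisy; a large window is smoother but lags behind. For long-running streams of floating-point values, the running sum in this sketch can accumulate rounding error, so a production implementation would likely use a more careful update.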

Page 12:

TYPES OF PROBLEMS

Regression: e.g., house prices

Classification: e.g., character recognition (OCR)

Clustering: e.g., movie tastes

[Each problem type is illustrated with a plot.]

Page 13:

REGRESSION

Model relationship between a numeric dependent variable y and one or more explanatory variables X.

Page 14:

Use creation date to predict # of versions

    // ... (the beginning of the listing is missing from the transcript)
        'loss': 'huber',
        'intercept': true
    });

    registry
        .on( 'package', function onPkg( pkg ) {
            var nVersions = pkg.versions ?
                pkg.versions.length : 0;
            if ( pkg.created ) {
                // Package age in years:
                var current = new Date().getTime();
                var created = new Date( pkg.created ).getTime();
                var age = ( current - created ) /
                    ( 1000 * 60 * 60 * 24 * 365 );
                // One observation: feature is the age, response is the # of versions.
                model.update( [ age ], nVersions );
            }
        })
    // ... (the rest of the listing is missing from the transcript)
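The slide configures its model with a Huber loss and an intercept and feeds it one observation at a time. As a rough sketch of what such an online update could look like, the following plain-JavaScript code performs stochastic gradient descent on the Huber loss. Everything here is an assumption made for illustration: the name `createHuberRegression`, the fixed learning rate, and the threshold `delta` are hypothetical and do not describe the library used in the talk.

```javascript
'use strict';

// Hypothetical sketch of online linear regression with Huber loss.
// Names and parameters are assumptions, not the API used on the slide.
function createHuberRegression( nFeatures, learningRate, delta ) {
    var weights = [];
    var intercept = 0.0;
    for ( var j = 0; j < nFeatures; j++ ) {
        weights.push( 0.0 );
    }
    function predict( x ) {
        var yhat = intercept;
        for ( var k = 0; k < nFeatures; k++ ) {
            yhat += weights[ k ] * x[ k ];
        }
        return yhat;
    }
    function update( x, y ) {
        var r = y - predict( x );
        // Negative gradient of the Huber loss with respect to the prediction:
        // the residual, clipped to [-delta, delta] so outliers have bounded pull.
        var g = Math.max( -delta, Math.min( delta, r ) );
        for ( var k = 0; k < nFeatures; k++ ) {
            weights[ k ] += learningRate * g * x[ k ];
        }
        intercept += learningRate * g;
    }
    function coefs() {
        // Returns [ w_1, ..., w_p, intercept ].
        return weights.concat( intercept );
    }
    return { 'predict': predict, 'update': update, 'coefs': coefs };
}

// Noise-free toy stream generated from y = 2 + 3x:
var model = createHuberRegression( 1, 0.1, 1.0 );
for ( var t = 0; t < 5000; t++ ) {
    var x = ( t % 11 ) / 10;
    model.update( [ x ], 2.0 + ( 3.0 * x ) );
}
var coefs = model.coefs();
console.log( coefs );
```

Clipping the residual is what distinguishes the Huber loss from squared error: for small residuals the update is the ordinary least-mean-squares step, while a single extreme observation cannot move the coefficients by more than `learningRate * delta` times the feature value.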

Page 15:

Number of package versions is positively correlated with age:

Regression line: ŷ = 0.000 + 0.000x

Page 16:

CLASSIFICATION

Model relationship between a categorical dependent variable y and one or more explanatory variables X.

Page 17:

Predicting a binary outcome

    // ... (the beginning of the listing is missing from the transcript)
    var model = onlineClassification({
        'lambda': 1e-6,
        'intercept': true,
        'loss': 'log'
    });

    registry.on( 'package', function onPkg( pkg ) {
        // Response: +1 if the package mentions React, -1 otherwise.
        var usesReact = pkg.mentions( 'react' ) ?
            1 :
            -1;

        // Features: does the package have each tool as a dev dependency?
        var features = [
            'webpack', 'browserify', 'jest',
            'tape', 'mocha'
        ].map(
            d => pkg.devDependsOn( d ) );

        // Predicted probability of the positive class and the resulting label:
        var phat = model.predict( features, 'probability' );
        var yhat = phat > 0.5 ? +1 : -1;
        // ... (the rest of the listing is missing from the transcript)
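The listing sets a log loss, an intercept, and a regularization parameter `lambda`, and uses labels +1 and -1. As a minimal sketch of one way such an online classifier could work, the code below runs stochastic gradient descent on the L2-regularized logistic loss. The name `createLogisticClassifier`, the fixed learning rate, and the toy data are assumptions for illustration only, not the implementation behind `onlineClassification`.

```javascript
'use strict';

// Hypothetical sketch of online logistic regression for labels +1 / -1.
function createLogisticClassifier( nFeatures, learningRate, lambda ) {
    var weights = [];
    var intercept = 0.0;
    for ( var j = 0; j < nFeatures; j++ ) {
        weights.push( 0.0 );
    }
    function score( x ) {
        var s = intercept;
        for ( var k = 0; k < nFeatures; k++ ) {
            s += weights[ k ] * x[ k ];
        }
        return s;
    }
    function predict( x, type ) {
        // Probability of the +1 class via the logistic function.
        var p = 1.0 / ( 1.0 + Math.exp( -score( x ) ) );
        if ( type === 'probability' ) {
            return p;
        }
        return ( p > 0.5 ) ? 1 : -1;
    }
    function update( x, y ) {
        // Negative gradient of log( 1 + exp( -y*s ) ) with respect to s:
        var g = y / ( 1.0 + Math.exp( y * score( x ) ) );
        for ( var k = 0; k < nFeatures; k++ ) {
            // Gradient step plus L2 shrinkage of the weight (not the intercept).
            weights[ k ] += learningRate * ( ( g * x[ k ] ) - ( lambda * weights[ k ] ) );
        }
        intercept += learningRate * g;
    }
    return { 'predict': predict, 'update': update };
}

// Toy stream: the label is +1 exactly when the single binary feature is 1.
var clf = createLogisticClassifier( 1, 0.5, 1e-6 );
for ( var t = 0; t < 2000; t++ ) {
    var flag = t % 2;
    clf.update( [ flag ], ( flag === 1 ) ? 1 : -1 );
}
var pOn = clf.predict( [ 1 ], 'probability' );
var pOff = clf.predict( [ 0 ], 'probability' );
console.log( pOn, pOff );
```

In a streaming setting it is common to call `predict` on each new observation before calling `update`, so that every prediction is made on data the model has not yet seen.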

Page 18:

React Usage

paradigm-tags, marketeer, couchcache, datamodel-to-openapi, eaze-request, paradigm-categories, joeljparks-hubot-cosmicjr, paradigm-taxonomies, gulp-controlled-merge-json, ember-cli-addon-tests

              Predicted Yes   Predicted No
Actual Yes    0               4
Actual No     0               34

[Chart: one value per feature (Webpack, Browserify, Jest, Tape, Mocha); axis from -2.0 to 2.0]

Page 19:

Evaluating regression and classification models

Look at generalization error (performance on data not used for model training).

Our toy model does not do so well: a misclassification rate of 13% might sound great, but always predicting -1 yields 15%!

[Plot: axis ticks from 500 to 2,500 and from 0.10 to 0.50, with a reference line at 15%]
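The comparison against a trivial baseline can be written down in a few lines. The sketch below computes a misclassification rate and the error of always predicting the most frequent label. The helper names are hypothetical, and the labels are rebuilt from the counts shown on the "React Usage" slide (4 actual yes, 34 actual no, all predicted no), which is a different snapshot from the 13% and 15% quoted here.

```javascript
'use strict';

// Fraction of observations where prediction and truth disagree.
function misclassificationRate( yTrue, yPred ) {
    var errors = 0;
    for ( var i = 0; i < yTrue.length; i++ ) {
        if ( yTrue[ i ] !== yPred[ i ] ) {
            errors += 1;
        }
    }
    return errors / yTrue.length;
}

// Error rate of a baseline that always predicts the most frequent label.
function baselineRate( yTrue ) {
    var counts = {};
    var max = 0;
    yTrue.forEach( function count( y ) {
        counts[ y ] = ( counts[ y ] || 0 ) + 1;
        if ( counts[ y ] > max ) {
            max = counts[ y ];
        }
    });
    return 1.0 - ( max / yTrue.length );
}

// Labels rebuilt from the confusion-matrix counts: +1 = uses React, -1 = does not.
var yTrue = [];
var yPred = [];
var i;
for ( i = 0; i < 4; i++ ) {
    yTrue.push( 1 );
    yPred.push( -1 );
}
for ( i = 0; i < 34; i++ ) {
    yTrue.push( -1 );
    yPred.push( -1 );
}
var modelError = misclassificationRate( yTrue, yPred );
var baseError = baselineRate( yTrue );
console.log( modelError, baseError );
```

With these counts the model and the baseline make exactly the same mistakes, which is why an error rate should always be read next to the class balance.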

Page 20:

CLUSTERING

Group observations into meaningful clusters such that objects within a cluster are similar to each other and different from objects assigned to the other clusters.
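In keeping with the incremental theme of the talk, clustering can also be updated one observation at a time. The sketch below shows sequential (MacQueen-style) k-means: each new point is assigned to its nearest centroid, and that centroid is moved toward the point. The name `createOnlineKMeans`, the requirement to pass initial centroids, and the toy data are assumptions for illustration, not something shown in the slides.

```javascript
'use strict';

// Hypothetical sketch of sequential k-means with caller-supplied initial centroids.
function createOnlineKMeans( initialCentroids ) {
    var centroids = initialCentroids.map( function copy( c ) {
        return c.slice();
    });
    // Counts start at zero, so the first point assigned to a cluster
    // replaces that cluster's initial centroid.
    var counts = centroids.map( function zero() {
        return 0;
    });
    function nearest( x ) {
        var best = 0;
        var bestDist = Infinity;
        for ( var k = 0; k < centroids.length; k++ ) {
            var d = 0.0;
            for ( var j = 0; j < x.length; j++ ) {
                var diff = x[ j ] - centroids[ k ][ j ];
                d += diff * diff; // squared Euclidean distance
            }
            if ( d < bestDist ) {
                bestDist = d;
                best = k;
            }
        }
        return best;
    }
    function update( x ) {
        var k = nearest( x );
        counts[ k ] += 1;
        for ( var j = 0; j < x.length; j++ ) {
            // Running-mean update of the winning centroid.
            centroids[ k ][ j ] += ( x[ j ] - centroids[ k ][ j ] ) / counts[ k ];
        }
        return k;
    }
    function getCentroids() {
        return centroids;
    }
    return { 'update': update, 'predict': nearest, 'centroids': getCentroids };
}

var km = createOnlineKMeans( [ [ 0, 0 ], [ 10, 10 ] ] );
var points = [ [ 1, 1 ], [ 9, 9 ], [ -1, -1 ], [ 11, 11 ], [ 0, 2 ], [ 10, 8 ] ];
var labels = points.map( function onPoint( p ) {
    return km.update( p );
});
console.log( labels, km.centroids() );
```

Each centroid ends up as the mean of the points assigned to it so far, but unlike batch k-means, earlier assignments are never revisited, so the result depends on the initial centroids and on the order in which the data arrive.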

Page 21:

POPULAR ALGORITHMS

k-means

DBSCAN

Hierarchical Clustering

Mixture of Gaussians

[Figure: two 3D plots of the iris data over Dim. 1, Dim. 2, and Dim. 3. Panel "Iris Species": Iris setosa, Iris versicolor, Iris virginica. Panel "k-Means Clusters": Cluster 1, Cluster 2, Cluster 3.]

Page 22:

FURTHER RESOURCES

Free Textbooks:

"An Introduction to Statistical Learning" by James, Witten, Hastie & Tibshirani (plus accompanying video lectures)

"The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Hastie, Tibshirani & Friedman

stdlib GitHub repository: https://github.com/stdlib-js/stdlib/tree/develop

Page 23:

THANK YOU!