
Machine Learning with MATLAB

David Willingham
Application Engineer

© 2014 The MathWorks, Inc.


Goals

Overview of machine learning
Machine learning models & techniques available in MATLAB
Streamlining the machine learning workflow with MATLAB


Motivation

Do you want to create a model of a system?
– Understand dynamics
– Predict outputs

How do you create a model?
– Develop an equation
    Takes time to develop, sometimes even years
    Unknown whether an equation exists at all
– Another option: machine learning


Used Across Many Application Areas

Biology: Tumor Detection, Drug Discovery
Financial Services: Credit Scoring, Algorithmic Trading, Bond Classification
Image & Video Processing: Pattern Recognition
Energy: Load and Price Forecasting, Trading
Audio Processing: Speech Recognition


Machine Learning: Characteristics and Examples

Characteristics
– Lots of data (many variables)
– System too complex to know the governing equation (e.g., black-box modeling)

Examples
– Pattern recognition (speech, images)
– Financial algorithms (credit scoring, algo trading)
– Energy forecasting (load, price)
– Biology (tumor detection, drug discovery)

[Figure: matrix of percentages by rating, AAA to D; each row sums to approximately 100%]

        AAA      AA       A        BBB      BB       B        CCC      D
AAA     93.68%   5.55%    0.59%    0.18%    0.00%    0.00%    0.00%    0.00%
AA      2.44%    92.60%   4.03%    0.73%    0.15%    0.00%    0.00%    0.06%
A       0.14%    4.18%    91.02%   3.90%    0.60%    0.08%    0.00%    0.08%
BBB     0.03%    0.23%    7.49%    87.86%   3.78%    0.39%    0.06%    0.16%
BB      0.03%    0.12%    0.73%    8.27%    86.74%   3.28%    0.18%    0.64%
B       0.00%    0.00%    0.11%    0.82%    9.64%    85.37%   2.41%    1.64%
CCC     0.00%    0.00%    0.00%    0.37%    1.84%    6.24%    81.88%   9.67%
D       0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    100.00%


Challenges – Machine Learning

Significant technical expertise required

No “one size fits all” solution

Locked into Black Box solutions

Time required to conduct the analysis


Overview – Machine Learning

Type of Learning and Categories of Algorithms

Supervised Learning: develop predictive model based on both input and output data
– Classification
– Regression

Unsupervised Learning: group and interpret data based only on input data
– Clustering


Unsupervised Learning

Clustering
– k-Means, Fuzzy C-Means
– Hierarchical
– Neural Networks
– Gaussian Mixture
– Hidden Markov Model


Supervised Learning

Regression
– Non-linear Reg. (GLM, Logistic)
– Linear Regression
– Decision Trees
– Ensemble Methods
– Neural Networks

Classification
– Nearest Neighbor
– Discriminant Analysis
– Naive Bayes
– Support Vector Machines


Supervised Learning - Workflow

Train the Model
Known data + Known responses -> Model

Use for Prediction
New Data -> Model -> Predicted Responses

Diagram labels
– Data: Import Data, Explore Data, Prepare Data
– Select Model
– Train the Model
– Measure Accuracy
– Use for Prediction
– Speed up Computations
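
The workflow above can be sketched end to end in a few lines. This is an illustration in plain Python rather than MATLAB, and it is not taken from the talk: the synthetic two-class data, the 70/30 split, and the nearest-centroid model are all assumptions chosen to keep the example short.

```python
import random
from statistics import mean

def train_centroids(X, y):
    """Train step: store one centroid (mean point) per class label."""
    by_label = {}
    for features, label in zip(X, y):
        by_label.setdefault(label, []).append(features)
    return {label: tuple(mean(col) for col in zip(*rows))
            for label, rows in by_label.items()}

def predict(model, features):
    """Predict step: return the label whose centroid is closest."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def accuracy(model, X, y):
    """Measure accuracy: fraction of held-out points predicted correctly."""
    hits = sum(predict(model, f) == label for f, label in zip(X, y))
    return hits / len(y)

# Import / prepare data: made-up, well-separated two-class data (assumption).
rng = random.Random(0)
data = [((rng.gauss(0, 0.5), rng.gauss(0, 0.5)), "low") for _ in range(50)]
data += [((rng.gauss(5, 0.5), rng.gauss(5, 0.5)), "high") for _ in range(50)]
rng.shuffle(data)

# Known data and known responses for training; the rest is held out.
train, test = data[:70], data[70:]
X_train, y_train = zip(*train)
X_test, y_test = zip(*test)

model = train_centroids(X_train, y_train)        # train the model
test_accuracy = accuracy(model, X_test, y_test)  # measure accuracy
new_prediction = predict(model, (4.8, 5.1))      # use for prediction

print(f"held-out accuracy: {test_accuracy:.2f}")
print(f"prediction for new point: {new_prediction}")
```

Holding out part of the known data is what makes the accuracy figure meaningful: the model is scored on points it did not see during training.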


Classification Overview

What is classification?

– Predicting the best group for each point

– “Learns” from labeled observations

– Uses input features

Why use classification?

– Accurately group data never seen before

How is classification done?

– Can use several algorithms to build a predictive model

– Good training data is critical
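
As one concrete instance of "learning" from labeled observations, here is a minimal nearest-neighbor sketch in plain Python (not MATLAB, and not from the talk). The six training points, their group labels, and the choice of k = 3 are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Labeled observations: input features plus known group (hypothetical values).
points = [(0.10, 0.20), (0.15, 0.25), (0.05, 0.30),
          (0.50, 0.80), (0.55, 0.90), (0.45, 0.85)]
labels = ["Group1", "Group1", "Group1", "Group2", "Group2", "Group2"]

# Points never seen before are assigned to the best-matching group.
print(knn_predict(points, labels, (0.12, 0.22)))  # Group1
print(knn_predict(points, labels, (0.52, 0.88)))  # Group2
```

The sketch also shows why good training data matters: the prediction is only as reliable as the labeled points that sit near the query.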

[Figure: scatter plot of data points labeled Group1 through Group8]


Clustering Overview

What is clustering?
– Segment data into groups, based on data similarity

Why use clustering?
– Identify outliers
– Resulting groups may be the matter of interest

How is clustering done?

– Can be achieved by various algorithms

– It is an iterative process (involving trial and error)
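
To make the iterative nature concrete, here is a minimal k-means sketch in plain Python (not MATLAB, and not from the talk). The six points and the two starting centroids are made-up assumptions; in practice the number of clusters and the starting centroids are among the things one varies by trial and error.

```python
import math
from statistics import mean

def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = list(initial_centroids)
    assignment = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centroids.append(tuple(mean(c) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignment

# Unlabeled data with two visible groups (hypothetical values).
points = [(0.1, 0.1), (0.2, 0.1), (0.1, 0.2),
          (0.5, 0.9), (0.6, 0.8), (0.5, 0.8)]

centroids, assignment = kmeans(points, [points[0], points[3]])
print(assignment)  # [0, 0, 0, 1, 1, 1]
```

A point that lies far from every centroid after convergence is a candidate outlier, which is one of the uses listed above.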


Deploying MATLAB Applications to Excel

[Diagram, steps 1 to 3: MATLAB Desktop and Toolboxes; MATLAB Compiler with MATLAB Builder EX, producing .dll and .bas files; End-User Machine]


Deployment Highlights

Royalty-free deployment
Point-and-click workflow
Unified process for desktop and server apps

Diagram labels: Java, Excel, .NET, .exe, C, CTF

Diagram labels: Application Servers, Database Servers, Web Applications, Client Front End Applications, Spreadsheets, Hadoop, Desktop Applications, Batch/Cron Jobs


MATLAB for Machine Learning

Challenge: Time (loss of productivity)
MATLAB solution: Rapid analysis and application development
    High productivity from data preparation, interactive exploration, visualizations

Challenge: Extract value from data
MATLAB solution: Machine learning, Video, Image, and Financial
    Depth and breadth of algorithms in classification, clustering, and regression

Challenge: Computation speed
MATLAB solution: Fast training and computation
    Parallel computation, optimized libraries

Challenge: Time to deploy & integrate
MATLAB solution: Ease of deployment and leveraging enterprise
    Push-button deployment into production

Challenge: Technology risk
MATLAB solution: High-quality libraries and support
    Industry-standard algorithms in use in production
    Access to support, training and advisory services when needed


Machine Learning with MATLAB

Interactive environment

– Visual tools for exploratory data analysis

– Easy to evaluate and choose best algorithm

– Apps available to help you get started (e.g., neural network tool, curve fitting tool)

Multiple algorithms to choose from

– Clustering

– Classification

– Regression


Learn More: Machine Learning with MATLAB

Classification with MATLAB
Credit Risk Modeling with MATLAB
Multivariate Classification in the Life Sciences
Electricity Load and Price Forecasting
Data Driven Fitting with MATLAB
Regression with MATLAB

http://www.mathworks.com/discovery/machine-learning.html


Training Services
Exploit the full potential of MathWorks products

Flexible delivery options:
– Public training available worldwide
– Onsite training with standard or customized courses
– Web-based training with live, interactive instructor-led courses
– Self-paced interactive online training

More than 30 course offerings:
– Introductory and intermediate training on MATLAB, Simulink, Stateflow, code generation, and Polyspace products
– Specialized courses in control design, signal processing, parallel computing, code generation, communications, financial analysis, and other areas


Consulting Services
Accelerating return on investment

A global team of experts supporting every stage of tool and process integration

Diagram labels (stages): Research, Advanced Engineering, Product Engineering Teams, Supplier Involvement

Diagram labels (services):
– Advisory Services
– Process Assessment
– Jumpstart
– Migration Planning
– Component Deployment
– Full Application Deployment
– Process and Technology Standardization
– Process and Technology Automation
– Continuous Improvement


Technical Support

Resources
Over 100 support engineers
– All with MS degrees (EE, ME, CS)
– Local support in North America, Europe, and Asia
Comprehensive, product-specific Web support resources

High customer satisfaction
– 95% of calls answered within three minutes
– 70% of issues resolved within 24 hours
– 80% of customers surveyed rate satisfaction at 80–100%


MATLAB Central

Community for MATLAB and Simulink users
Over 1 million visits per month

File Exchange
– Upload/download access to free files including MATLAB code, Simulink models, and documents
– Ability to rate files, comment, and ask questions
– More than 12,500 contributed files, 300 submissions per month, 50,000 downloads per month

Newsgroup
– Web forum for technical discussions about MathWorks products
– More than 300 posts per day

Blogs
– Commentary from engineers who design, build, and support MathWorks products
– Open conversation at blogs.mathworks.com

Based on February 2011 data


Questions?