david willingham application engineer - mathworks...20 challenges matlab solution time (loss of...
TRANSCRIPT
1© 2014 The MathWorks, Inc.
Machine Learning with MATLAB
David Willingham
Application Engineer
2
Goals
Overview of machine learning
Machine learning models & techniques available in
MATLAB
Streamlining the machine learning workflow with
MATLAB
3
Motivation
Do you want to create a model of a system?
– Understand dynamics
– Predict Outputs
How do you create a model?
– Develop an equation
Takes time to develop, sometimes even years
Unknown if there is actually an equation at all
– Another option, Machine Learning
4
Used Across Many Application Areas
Biology Financial
Services
Image & Video
ProcessingEnergy
Pattern
Recognition
Credit Scoring,
Algorithm
Trading, Bond
Classification
Load, Price
Forecasting,
Trading
Tumor Detection,
Drug Discovery
Audio
Processing
Speech
Recognition
6
Machine LearningCharacteristics and Examples
Characteristics
– Lots of data (many variables)
– System too complex to know
the governing equation(e.g., black-box modeling)
Examples
– Pattern recognition (speech, images)
– Financial algorithms (credit scoring, algo trading)
– Energy forecasting (load, price)
– Biology (tumor detection, drug discovery)
93.68%
2.44%
0.14%
0.03%
0.03%
0.00%
0.00%
0.00%
5.55%
92.60%
4.18%
0.23%
0.12%
0.00%
0.00%
0.00%
0.59%
4.03%
91.02%
7.49%
0.73%
0.11%
0.00%
0.00%
0.18%
0.73%
3.90%
87.86%
8.27%
0.82%
0.37%
0.00%
0.00%
0.15%
0.60%
3.78%
86.74%
9.64%
1.84%
0.00%
0.00%
0.00%
0.08%
0.39%
3.28%
85.37%
6.24%
0.00%
0.00%
0.00%
0.00%
0.06%
0.18%
2.41%
81.88%
0.00%
0.00%
0.06%
0.08%
0.16%
0.64%
1.64%
9.67%
100.00%
AAA AA A BBB BB B CCC D
AAA
AA
A
BBB
BB
B
CCC
D
7
Challenges – Machine Learning
Significant technical expertise required
No “one size fits all” solution
Locked into Black Box solutions
Time required to conduct the analysis
8
Overview – Machine Learning
Machine
Learning
Supervised
Learning
Classification
Regression
Unsupervised
LearningClustering
Group and interpretdata based only
on input data
Develop predictivemodel based on bothinput and output data
Type of Learning Categories of Algorithms
9
Unsupervised Learning
Clustering
k-Means,
Fuzzy C-Means
Hierarchical
Neural
Networks
Gaussian
Mixture
Hidden Markov
Model
10
Supervised Learning
Regression
Non-linear Reg.
(GLM, Logistic)
Linear
RegressionDecision Trees
Ensemble
Methods
Neural
Networks
Classification
Nearest
Neighbor
Discriminant
AnalysisNaive Bayes
Support Vector
Machines
11
Supervised Learning - Workflow
Known data
Known responses
Model
Train the Model
Model
New Data
Predicted
Responses
Use for Prediction
Measure Accuracy
Select Model
Import Data
Explore Data
Data
Prepare Data
Speed up Computations
12
ClassificationOverview
What is classification?
– Predicting the best group for each point
– “Learns” from labeled observations
– Uses input features
Why use classification?
– Accurately group data never seen before
How is classification done?
– Can use several algorithms to build a predictive model
– Good training data is critical
-0.1 0 0.1 0.2 0.3 0.4 0.5 0.60
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Group1
Group2
Group3
Group4
Group5
Group6
Group7
Group8
15
ClusteringOverview
What is clustering?
– Segment data into groups,
based on data similarity
Why use clustering?
– Identify outliers
– Resulting groups may be
the matter of interest
How is clustering done?
– Can be achieved by various algorithms
– It is an iterative process (involving trial and error)
-0.1 0 0.1 0.2 0.3 0.4 0.5 0.60
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
18
MATLABDesktop
End-UserMachine
1
2
3Toolboxes
Deploying MATLAB Applications to Excel
MATLABBuilder EX
MATLAB Compiler
.dll .bas
19
Deployment Highlights
Java
Excel
.NET
.exe
C
CTF
Royalty-free deployment
Point-and-click workflow
Unified process for desktop and server apps
Application Servers
Database Servers
Web Applications
Client Front End
Applications
SpreadsheetsHADOOP
Desktop Applications
Batch/Cron Jobs
20
Challenges MATLAB Solution
Time (loss of productivity) Rapid analysis and application developmentHigh productivity from data preparation, interactive
exploration, visualizations.
Extract value from data Machine learning, Video, Image, and FinancialDepth and breadth of algorithms in classification, clustering,
and regression
Computation speed Fast training and computationParallel computation, Optimized libraries
Time to deploy & integrate Ease of deployment and leveraging enterprisePush-button deployment into production
Technology risk High-quality libraries and supportIndustry-standard algorithms in use in production
Access to support, training and advisory services when
needed
MATLAB for Machine Learning
21
Machine Learning with MATLAB
Interactive environment
– Visual tools for exploratory data analysis
– Easy to evaluate and choose best algorithm
– Apps available to help you get started(e.g,. neural network tool, curve fitting tool)
Multiple algorithms to choose from
– Clustering
– Classification
– Regression
22
Learn More : Machine Learning with MATLAB
Classification
with MATLAB
Credit Risk Modeling with
MATLAB
Multivariate Classification in
the Life Sciences
Electricity Load and
Price Forecasting
http://www.mathworks.com/discovery/
machine-learning.html
Data Driven Fitting
with MATLABRegression
with MATLAB
23
Training ServicesExploit the full potential of MathWorks products
Flexible delivery options:
Public training available worldwide
Onsite training with standard or
customized courses
Web-based training with live, interactive
instructor-led courses
Self-paced interactive online training
More than 30 course offerings:
Introductory and intermediate training on MATLAB, Simulink,
Stateflow, code generation, and Polyspace products
Specialized courses in control design, signal processing, parallel computing,
code generation, communications, financial analysis,
and other areas
24
Migration Planning
Component Deployment
Full Application Deployment
Co
nti
nu
ou
s Im
pro
ve
me
nt
Consulting ServicesAccelerating return on investment
A global team of experts supporting every stage of tool and process integration
Supplier InvolvementProduct Engineering TeamsAdvanced EngineeringResearch
Advisory Services
Process Assessment
Jumpstart
Process and Technology
Standardization
Process and Technology
Automation
25
Technical Support
Resources
Over 100 support engineers
– All with MS degrees (EE, ME, CS)
– Local support in North America,
Europe, and Asia
Comprehensive, product-specific Web
support resources
High customer satisfaction
95% of calls answered
within three minutes
70% of issues resolved
within 24 hours
80% of customers surveyed
rate satisfaction at 80–100%
26
MATLAB Central
Community for MATLAB and Simulink
users
Over 1 million visits per month
File Exchange– Upload/download access to free files
including MATLAB code, Simulink models,
and documents
– Ability to rate files, comment, and ask questions
– More than 12,500 contributed files, 300
submissions per month, 50,000 downloads
per month
Newsgroup– Web forum for technical discussions about
MathWorks products
– More than 300 posts per day
Blogs– Commentary from engineers who design, build,
and support MathWorks products
– Open conversation at blogs.mathworks.com
Based on February 2011 data
27
Questions?